On Wed, Feb 19, 2014 at 8:28 AM, Greg Stark <st...@mit.edu> wrote:
> On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>
>> Agreed; that was the original plan, but implementation delays
>> prevented the whole vision/discussion/implementation. Requirements
>> from various areas include WAL rate limiting for replication, I/O rate
>> limiting, hard CPU and I/O limits for security and mixed workload
>> coexistence.
>>
>> I'd still like to get something on this in 9.4 that alleviates the
>> replication issues, leaving wider changes for later releases.
>
> My first reaction was that we should just have generic I/O resource
> throttling. I was only convinced this was a reasonable idea by the
> replication use case. It would help me to understand the specific
> situations where replication breaks down due to WAL bandwidth
> starvation. Heroku has had some problems with slaves falling behind,
> though the immediate problem that causes is the slave filling up disk,
> which we could solve more directly by switching to archive mode rather
> than slowing down the master.
>
> But I would suggest you focus on a specific use case that's
> problematic so we can judge better if the implementation is really
> fixing it.
>
>> The vacuum_* parameters don't allow any control over WAL production,
>> which is often the limiting factor. I could, for example, introduce a
>> new parameter for vacuum_cost_delay that provides a weighting for each
>> new BLCKSZ chunk of WAL, then rename all parameters to a more general
>> form. Or I could forget that and just press ahead with the patch as
>> is, providing a cleaner interface in the next release.
>>
>>> It's also interesting to wonder about the relationship to
>>> CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
>>> that that's *cheap* (1 test and branch) as long as nothing is pending.
>>> I don't want to see a bunch of arithmetic added to it.
>>
>> Good point.
>
> I think it should be possible to actually merge it into
> CHECK_FOR_INTERRUPTS. Have a single global flag
> io_done_since_check_for_interrupts which is set to 0 after each
> CHECK_FOR_INTERRUPTS and set to 1 whenever any WAL is written. Then
> CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
> in the normal case.
>
> In fact you could do all the arithmetic when you do the WAL write.
> Only set the flag if the bandwidth consumed is above the budget. Then
> the flag should only ever be set when you're about to sleep.
>
> I would dearly love to see generic I/O bandwidth limits, so it would
> be nice to see a nicely general pattern here that could be extended
> even if we only target WAL this release.
>
> I'm going to read the existing patch now. Do you think it's ready to
> go, or did you want to do more work based on the feedback?
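
For reference, the flag-based scheme described above could look roughly
like the sketch below. This is only an illustration under stated
assumptions, not code from the patch: every name in it
(interrupt_pending, wal_throttle_pending, wal_budget_bytes,
wal_account_write, CHECK_FOR_INTERRUPTS_SKETCH, and so on) is
hypothetical, the "sleep" is replaced by a counter so the example stays
self-contained, and the budget is a simple per-window byte count rather
than a real rate calculation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names here are hypothetical; none are taken from PostgreSQL. */

/* Stand-in for the existing "something is pending" flag. */
static volatile bool interrupt_pending = false;
/* Second flag: set only when the WAL budget has been exceeded. */
static volatile bool wal_throttle_pending = false;

static int64_t wal_budget_bytes = 0;    /* bytes allowed per window; 0 disables */
static int64_t wal_bytes_in_window = 0; /* bytes written since the last sleep */
static int     throttle_sleeps = 0;     /* counts would-be sleeps for the demo */

/*
 * Called from the WAL write path. All the arithmetic lives here, so the
 * interrupt check itself stays a plain test and branch.
 */
static void
wal_account_write(int64_t nbytes)
{
    assert(nbytes >= 0);

    if (wal_budget_bytes <= 0)
        return;                 /* throttling disabled */

    wal_bytes_in_window += nbytes;
    if (wal_bytes_in_window > wal_budget_bytes)
        wal_throttle_pending = true;    /* only set when a sleep is due */
}

/* Slow path: reached only when the budget was exceeded. */
static void
process_wal_throttle(void)
{
    wal_throttle_pending = false;
    wal_bytes_in_window = 0;
    throttle_sleeps++;          /* a real implementation would sleep here */
}

/* Slow path for ordinary interrupts; a stub in this sketch. */
static void
process_interrupts(void)
{
    interrupt_pending = false;
}

/* Two tests and branches in the normal case, no arithmetic. */
#define CHECK_FOR_INTERRUPTS_SKETCH() \
    do { \
        if (interrupt_pending) \
            process_interrupts(); \
        if (wal_throttle_pending) \
            process_wal_throttle(); \
    } while (0)
```

With wal_budget_bytes set to 8192, a 4096-byte write followed by
CHECK_FOR_INTERRUPTS_SKETCH() does nothing beyond the two tests, while a
further 8192-byte write pushes the window over budget, sets the flag, and
makes the next check take the slow path once.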

Well, *I* don't think this is ready to go. A WAL rate limit that only
limits WAL sometimes still doesn't impress me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company