Re: [HACKERS] WAL Rate Limiting

Robert Haas Wed, 19 Feb 2014 08:05:04 -0800

On Wed, Feb 19, 2014 at 8:28 AM, Greg Stark <st...@mit.edu> wrote:
> On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>
>> Agreed; that was the original plan, but implementation delays
>> prevented the whole vision/discussion/implementation. Requirements
>> from various areas include WAL rate limiting for replication, I/O rate
>> limiting, hard CPU and I/O limits for security and mixed workload
>> coexistence.
>>
>> I'd still like to get something on this in 9.4 that alleviates the
>> replication issues, leaving wider changes for later releases.
>
> My first reaction was that we should just have generic I/O resource
> throttling. I was only convinced this was a reasonable idea by the
> replication use case. It would help me to understand the specific
> situations where replication breaks down due to WAL bandwidth
> starvation. Heroku has had some problems with slaves falling behind,
> though the immediate problem that causes is the slave filling up disk,
> which we could solve more directly by switching to archive mode rather
> than slowing down the master.
>
> But I would suggest you focus on a specific use case that's
> problematic so we can judge better if the implementation is really
> fixing it.
>
>> The vacuum_* parameters don't allow any control over WAL production,
>> which is often the limiting factor. I could, for example, introduce a
>> new parameter for vacuum_cost_delay that provides a weighting for each
>> new BLCKSZ chunk of WAL, then rename all parameters to a more general
>> form. Or I could forget that and just press ahead with the patch as
>> is, providing a cleaner interface in next release.
>>
>>> It's also interesting to wonder about the relationship to
>>> CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
>>> that that's *cheap* (1 test and branch) as long as nothing is pending.
>>> I don't want to see a bunch of arithmetic added to it.
>>
>> Good point.
>
> I think it should be possible to actually merge it into
> CHECK_FOR_INTERRUPTS. Have a single global flag
> io_done_since_check_for_interrupts which is set to 0 after each
> CHECK_FOR_INTERRUPTS and set to 1 whenever any WAL is written. Then
> CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
> in the normal case.
>
> In fact you could do all the arithmetic when you do the wal write.
> Only set the flag if the bandwidth consumed is above the budget. Then
> the flag should only ever be set when you're about to sleep.
>
> I would dearly love to see generic I/O bandwidth limits, so it would
> be nice to see a nicely general pattern here that could be extended
> even if we only target WAL this release.
>
> I'm going to read the existing patch now. Do you think it's ready to
> go or did you want to do more work based on the feedback?
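
[For readers following the flag-based idea quoted above, here is a rough
sketch of how it might look. It is not taken from the patch under
discussion. Every name in it (interrupt_pending, wal_throttle_pending,
note_wal_written, wal_budget_bytes, and so on) is hypothetical and only
stands in for whatever the backend would really use. The arithmetic is
done at WAL-write time, so the interrupt check itself only gains one
extra test and branch.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not PostgreSQL symbols. */
static volatile bool interrupt_pending = false;    /* existing "something is pending" flag */
static volatile bool wal_throttle_pending = false; /* new flag: set only when over budget */
static uint64_t wal_bytes_since_sleep = 0;         /* WAL written since the last throttle sleep */
static uint64_t wal_budget_bytes = 0;              /* assumed setting; 0 disables throttling */
static unsigned throttle_sleeps = 0;               /* counts sleeps, for illustration only */

/* Called from the WAL write path: all arithmetic happens here. */
static void
note_wal_written(size_t nbytes)
{
    if (wal_budget_bytes == 0)
        return;                         /* throttling disabled */
    wal_bytes_since_sleep += nbytes;
    if (wal_bytes_since_sleep > wal_budget_bytes)
        wal_throttle_pending = true;    /* only set when a sleep is due */
}

/* Placeholder for the real interrupt handling. */
static void
process_interrupts(void)
{
    interrupt_pending = false;
}

/* Placeholder for the sleep; a real version would actually wait here. */
static void
throttle_sleep(void)
{
    assert(wal_budget_bytes > 0);
    throttle_sleeps++;
    wal_bytes_since_sleep = 0;
    wal_throttle_pending = false;
}

/* The check: two cheap tests and branches when nothing is pending. */
static void
check_for_interrupts(void)
{
    if (interrupt_pending)
        process_interrupts();
    if (wal_throttle_pending)
        throttle_sleep();
}
```

[With a budget of 100 bytes, two 60-byte writes would set the flag on the
second write, and the next check would sleep once and reset the counter.
Note that this sketch only throttles code paths that reach the check.]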


Well, *I* don't think this is ready to go.  A WAL rate limit that only
limits WAL sometimes still doesn't impress me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

