On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs <si...@2ndquadrant.com> wrote:

> Agreed; that was the original plan, but implementation delays
> prevented the whole vision/discussion/implementation. Requirements
> from various areas include WAL rate limiting for replication, I/O rate
> limiting, hard CPU and I/O limits for security and mixed workload
> coexistence.
> I'd still like to get something on this in 9.4 that alleviates the
> replication issues, leaving wider changes for later releases.

My first reaction was that we should just have generic I/O resource
throttling. I was only convinced this was a reasonable idea by the
replication use case. It would help me to understand the specific
situations where replication breaks down due to WAL bandwidth
starvation. Heroku has had some problems with slaves falling behind,
though the immediate problem that causes is the slave filling up its
disk, which we could solve more directly by switching to archive mode
rather than slowing down the master.

But I would suggest you focus on a specific use case that's
problematic so we can judge better if the implementation is really
fixing it.

> The vacuum_* parameters don't allow any control over WAL production,
> which is often the limiting factor. I could, for example, introduce a
> new parameter for vacuum_cost_delay that provides a weighting for each
> new BLCKSZ chunk of WAL, then rename all parameters to a more general
> form. Or I could forget that and just press ahead with the patch as
> is, providing a cleaner interface in next release.
>> It's also interesting to wonder about the relationship to
>> CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
>> that that's *cheap* (1 test and branch) as long as nothing is pending.
>> I don't want to see a bunch of arithmetic added to it.
> Good point.

I think it should be possible to actually merge it into
CHECK_FOR_INTERRUPTS. Have a single global flag
io_done_since_check_for_interrupts which is set to 0 after each
CHECK_FOR_INTERRUPTS and set to 1 whenever any WAL is written. Then
CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
in the normal case.
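
Roughly what I have in mind, as a sketch only. Everything here except
the flag name is a made-up stand-in (interrupt_pending,
process_interrupts, throttle_io, wal_write_sketch), and the macro is
not the real CHECK_FOR_INTERRUPTS definition:

```c
#include <stdbool.h>

/* Hypothetical stand-ins, not the real PostgreSQL globals. */
static volatile bool interrupt_pending = false;
static volatile bool io_done_since_check_for_interrupts = false;
static int throttle_calls = 0;  /* counts how often we would throttle */

/* Stand-in for the existing slow path that services an interrupt. */
static void process_interrupts(void)
{
    interrupt_pending = false;
}

/* Stand-in for the slow path that would decide whether to sleep. */
static void throttle_io(void)
{
    throttle_calls++;
}

/*
 * Two tests and branches in the common case: the existing interrupt
 * test plus one test of the I/O flag. The flag is cleared here and
 * the slow path runs only if some WAL was written since last time.
 */
#define CHECK_FOR_INTERRUPTS_SKETCH() \
    do { \
        if (interrupt_pending) \
            process_interrupts(); \
        if (io_done_since_check_for_interrupts) \
        { \
            io_done_since_check_for_interrupts = false; \
            throttle_io(); \
        } \
    } while (0)

/* Stand-in for the WAL write path: it only has to set the flag. */
static void wal_write_sketch(void)
{
    io_done_since_check_for_interrupts = true;
}
```

The point is that a backend which writes no WAL pays only for the
extra test, and no arithmetic is added to the macro itself.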

In fact you could do all the arithmetic when you do the WAL write.
Only set the flag if the bandwidth consumed is above the budget. Then
the flag should only ever be set when you're about to sleep.
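
As a sketch of that variant, with the accounting done at write time.
All names and the budget value are hypothetical, the real thing would
sleep rather than bump a counter, and how the budget interval gets
reset is left out entirely:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical budget: WAL bytes allowed before we must sleep. */
static uint64_t wal_budget_bytes = 4 * 8192;
static uint64_t wal_bytes_since_sleep = 0;
static volatile bool wal_throttle_pending = false;
static int sleeps_taken = 0;  /* stands in for actually sleeping */

/*
 * Called from the WAL write path. All the arithmetic lives here, so
 * the flag is raised only once the budget has been exceeded.
 */
static void account_wal_write(uint64_t nbytes)
{
    wal_bytes_since_sleep += nbytes;
    if (wal_bytes_since_sleep > wal_budget_bytes)
        wal_throttle_pending = true;
}

/*
 * Called from the interrupt check. It costs one test and branch
 * unless the flag is set, and a set flag always means "sleep now".
 * Returns true if a sleep was (notionally) taken.
 */
static bool wal_throttle_if_pending(void)
{
    if (!wal_throttle_pending)
        return false;

    wal_throttle_pending = false;
    wal_bytes_since_sleep = 0;
    sleeps_taken++;  /* a real implementation would sleep here */
    return true;
}
```

With this shape the check never has to look at byte counts; it only
reacts to a decision already made by the writer.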

I would dearly love to see generic I/O bandwidth limits, so it would
be nice to see a general pattern here that could be extended even if
we only target WAL in this release.

I'm going to read the existing patch now. Do you think it's ready to
go, or did you want to do more work based on the feedback?

