Tom Lane wrote:
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
The main tuning knob is checkpoint_smoothing, which is defined as a fraction of the checkpoint interval (both checkpoint_timeout and checkpoint_segments are taken into account). Normally, the write phase of a checkpoint takes exactly that much time.

So the question is, why in the heck would anyone want the behavior that
"checkpoints take exactly X time"??  The useful part of this whole patch
is to cap the write rate at something that doesn't interfere too much
with foreground queries.  I don't see why people wouldn't prefer
"checkpoints can take any amount of time up to the checkpoint interval,
but we do our best not to exceed Y writes/second".

Because it's easier to tune. You don't need to know how much checkpoint I/O you can tolerate. The system will use just enough I/O bandwidth to meet the deadline, but not more than that.
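
To make "just enough I/O bandwidth to meet the deadline" concrete, here is a minimal sketch, not the patch's actual code. The function name, its parameters, and the choice to measure the interval in seconds only (ignoring checkpoint_segments) are assumptions made for illustration.

```python
def required_write_rate(pages_left, elapsed_s, interval_s, smoothing):
    """Pages/second needed to finish the write phase by smoothing * interval.

    Hypothetical helper: only the time-based interval is modelled here.
    """
    deadline_s = smoothing * interval_s      # target length of the write phase
    time_left_s = deadline_s - elapsed_s
    if pages_left <= 0:
        return 0.0                           # nothing left to write
    if time_left_s <= 0:
        return float("inf")                  # past the deadline: no throttling
    return pages_left / time_left_s


# 6000 dirty pages, 300 s interval, smoothing 0.5 gives a 150 s deadline
print(required_write_rate(6000, 0, 300, 0.5))   # 40.0
```

The rate falls out of the dirty page count and the deadline, so nobody has to guess a tolerable I/O figure in advance.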

Basically I don't see what useful values checkpoint_smoothing would have
other than 0 and 1.  You might as well make it a bool.

Well, that's one option. It feels like a good thing to be able to control how much headroom you have until the next checkpoint, but maybe we can just hardcode it close to 1. It's also good to avoid spreading the checkpoints unnecessarily, to keep recovery times lower, but you can control that with the min rate setting as well.

There's another possible strategy: keep the I/O rate constant, but vary the length of the checkpoint. checkpoint_rate allows you to do that.

But only from the lower side.

Now how would you replace checkpoint_smoothing with a max I/O rate?

I don't see why you think that's hard.  It looks to me like the
components of the decision are the same numbers in any case: you have to
estimate your progress towards checkpoint completion, your available
time till next checkpoint, and your write rate.  Then you either delay
or not.
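
The delay-or-not decision described above can be sketched roughly as follows. This is an illustrative assumption about how the comparison might look, not code from the patch; should_delay and its parameters are hypothetical names.

```python
def should_delay(pages_written, pages_total, elapsed_s, deadline_s):
    """Return True if the checkpoint is ahead of schedule and may sleep.

    Hypothetical helper: compares the fraction of pages written with
    the fraction of the available time already used.
    """
    if pages_total <= 0:
        return False                         # nothing to write, nothing to pace
    progress = pages_written / pages_total
    time_used = elapsed_s / deadline_s if deadline_s > 0 else 1.0
    return progress > time_used


print(should_delay(500, 1000, 30, 100))   # True: 50% written, 30% of time used
print(should_delay(200, 1000, 30, 100))   # False: 20% written, 30% of time used
```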

Let me put it this way: If you define a min and a max I/O rate, when would the max I/O rate limit take effect? If there are few dirty buffers in the pool, so that you'll finish the checkpoint before the next one is due while writing at the min rate, that's the rate you'll use. If there are more, you'll need to write fast enough to finish the checkpoint in time, regardless of the max rate. Or would you let the next checkpoint slip and keep writing at the max rate? That seems like a footgun if someone sets it to too low a value.
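
A small sketch of the cases in the paragraph above, under the assumption that the deadline rate is simply pages left divided by time left. effective_rate and its arguments are hypothetical, and time_left_s is assumed to be positive.

```python
def effective_rate(pages_left, time_left_s, min_rate, max_rate=None):
    """Return (write rate in pages/s, whether the next checkpoint slips).

    Hypothetical helper illustrating how a min and a max rate interact
    with the rate needed to meet the deadline.
    """
    deadline_rate = pages_left / time_left_s
    rate = max(min_rate, deadline_rate)      # the min rate is only a floor
    if max_rate is None:
        return rate, False
    # A cap only matters when it is below the deadline rate, and then
    # honouring it means the checkpoint cannot finish in time.
    return min(rate, max_rate), deadline_rate > max_rate


print(effective_rate(1000, 100, 50))        # (50, False): few dirty buffers
print(effective_rate(10000, 100, 50))       # (100.0, False): deadline wins
print(effective_rate(10000, 100, 50, 80))   # (80, True): cap makes it slip
```

In this model the max rate is either inactive or it forces the deadline to slip; there is no third case.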

Or are you thinking that we have just one setting: checkpoint_rate? You describe it as a maximum, but I've been thinking of it as a minimum because you *will* exceed it if the next checkpoint is due soon.

  Heikki Linnakangas
