Re: [PATCHES] Load distributed checkpoint V4.1

Heikki Linnakangas Wed, 25 Apr 2007 03:48:17 -0700

ITAGAKI Takahiro wrote:

Here is an updated version of the LDC patch (V4.1).
In this release, checkpoints finish quickly if there are only a few dirty pages
in the buffer pool, following the suggestion from Heikki. Thanks.

Excellent, thanks! I was just looking at the results from my test runs with version 4. I'll kick off some more tests with this version.

If the last write phase finished more quickly than configured,
the next nap phase is also shortened by the same ratio. For example, if we
set checkpoint_write_percent = 50% and the write phase actually finished
in 25% of checkpoint time, the duration of nap time is adjusted to
checkpoint_nap_percent * 25% / 50%.

You mean checkpoint_nap_percent * 50%, I presume, where 50% = (actual time spent in write phase) / (checkpoint_write_percent)? Sounds good to me.
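
As a concrete illustration of that scaling, here is a minimal Python sketch. It is not code from the patch: the function name, the use of fractions instead of percentages, the cap at 1.0 (so a slow write phase never stretches the nap), and the 20% nap setting in the example are all assumptions made for illustration.

```python
def scaled_nap_fraction(nap_percent, write_percent, actual_write_fraction):
    """Return the nap phase length as a fraction of the checkpoint interval.

    All arguments are fractions of the checkpoint interval (0.5 means 50%).
    """
    if write_percent <= 0:
        # No write phase configured; nothing to scale against (assumption).
        return 0.0
    # Ratio of actual to configured write-phase duration. Capping at 1.0 is
    # an assumption: it keeps the nap from growing past its configured share.
    ratio = min(1.0, actual_write_fraction / write_percent)
    return nap_percent * ratio


# Example from the thread: write phase configured at 50%, finished in 25%,
# so the nap phase is cut to half of its configured length.
nap = scaled_nap_fraction(nap_percent=0.2, write_percent=0.5,
                          actual_write_fraction=0.25)
```

With these inputs the ratio is 25% / 50% = 0.5, so a hypothetical 20% nap setting shrinks to 10% of the checkpoint interval.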

In the sync phase, we cut down the duration if there are only a few files
to fsync. We assume that the storage has a throughput of at least
10 * bgwriter_all_maxpages pages per bgwriter_delay (this is arguable).
For example, when bgwriter_delay=200ms and bgwriter_all_maxpages=5, we
assume that we can use 2MB/s of flush throughput (10 * 5 pages * 8kB / 200ms).
If there are 200MB of files to fsync, the duration of the sync phase is
cut down to 100sec, even if that is shorter than
checkpoint_sync_percent * checkpoint_timeout.

Sounds reasonable. 10 * bgwriter_all_maxpages is indeed quite arbitrary, but it should be enough to eliminate ridiculously long waits if there's very little work to do. Or we could do the same thing you did with the nap phase, scaling down the time allocated for sync phase by the ratio of (actual time spent in write phase) / (checkpoint_write_percent). Using the same mechanism in nap and sync phases sounds like a good idea.
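
The arithmetic in the quoted example can be checked with a short Python sketch. This is only an illustration under stated assumptions, not the patch's code: the function names are hypothetical, sizes are in kB with 1MB taken as 1000kB (the rounding the example uses), and taking the minimum of the configured and the throughput-based duration is one reading of "cut down".

```python
PAGE_KB = 8        # page size in kB, as in the quoted example
BAND_FACTOR = 10   # the arbitrary "10 *" multiplier discussed above


def assumed_flush_kb_per_s(bgwriter_all_maxpages, bgwriter_delay_ms):
    """Flush throughput the sync phase is assumed to be able to use."""
    pages_per_round = BAND_FACTOR * bgwriter_all_maxpages
    return pages_per_round * PAGE_KB * 1000 / bgwriter_delay_ms


def sync_phase_seconds(kb_to_fsync, bgwriter_all_maxpages, bgwriter_delay_ms,
                       checkpoint_sync_percent, checkpoint_timeout_s):
    """Sync phase length: the configured time, shortened if work is small."""
    configured = checkpoint_sync_percent * checkpoint_timeout_s
    throughput = assumed_flush_kb_per_s(bgwriter_all_maxpages,
                                        bgwriter_delay_ms)
    if throughput <= 0:
        # No bandwidth estimate available; fall back to the configured time.
        return configured
    needed = kb_to_fsync / throughput
    return min(configured, needed)


# 200MB to fsync, bgwriter_all_maxpages=5, bgwriter_delay=200ms.
# The 20% sync share and 900s timeout are made-up values for the example.
duration = sync_phase_seconds(200_000, 5, 200, 0.2, 900)
```

Here the assumed throughput comes out at 2000 kB/s and the sync phase at 100 seconds, matching the figures in the quoted text.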

I use bgwriter_all_maxpages as something like 'reserved band of storage
for bgwriter' here. If there is a better name for it, please rename it.

How about checkpoint_aggressiveness? Or checkpoint_throughput? I think the correct metric is (k/M)bytes/sec, making it independent of bgwriter_delay.
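
To show what a bytes/sec setting would mean in practice, here is a rough Python sketch that converts such a rate into a per-round page budget. The setting name checkpoint_throughput is only the proposal from this message, not an existing parameter, and the function name, the kB/s unit and the rounding up are assumptions.

```python
import math

PAGE_KB = 8  # page size in kB


def pages_per_round(checkpoint_throughput_kb_per_s, bgwriter_delay_ms):
    """Pages to write per bgwriter round to sustain the given kB/s rate."""
    kb_per_round = checkpoint_throughput_kb_per_s * bgwriter_delay_ms / 1000
    # Round up so the configured rate is a floor rather than a ceiling
    # (an arbitrary choice for this sketch).
    return math.ceil(kb_per_round / PAGE_KB)


# The same 2000 kB/s rate yields different per-round budgets as
# bgwriter_delay changes, so the rate itself no longer depends on the delay.
budget_200ms = pages_per_round(2000, 200)
budget_100ms = pages_per_round(2000, 100)
```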

Do we want the same setting to be used for bgwriter_all_maxpages? I don't think we have a reason to believe the same value is good for both. In fact I think we should just get rid of bgwriter_all_* eventually, but as Greg Smith pointed out we need more testing before we can do that :).

There's one more optimization I'd like to have. Checkpoint scans through *all* dirty buffers and writes them out. However, some of those dirty buffers might have been dirtied *after* the start of the checkpoint, and flushing them is a waste of I/O if they get dirtied again before the next checkpoint. Even if they don't, it seems better not to force them to disk at checkpoint; checkpoint is heavy enough without any extra I/O. It didn't make much difference without LDC, because we tried to complete the writes as soon as possible so there wasn't a big window for that to happen, but now that we spread out the writes it makes a lot of sense. I wrote a quick & dirty patch to implement that, and at least in my test case it does make some difference.
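
One possible way to get that effect, sketched in Python as a toy model: flag the buffers that are dirty when the checkpoint starts, and write only flagged buffers. This is an assumption about how it could be done, not a description of the quick & dirty patch; the Buffer class, the checkpoint_needed flag and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    page_id: int
    dirty: bool = False
    checkpoint_needed: bool = False  # hypothetical per-buffer flag


def begin_checkpoint(pool):
    """Flag exactly the buffers that are dirty when the checkpoint starts."""
    for buf in pool:
        buf.checkpoint_needed = buf.dirty


def dirty_page(buf):
    """A backend modifies a page; this never sets checkpoint_needed."""
    buf.dirty = True


def checkpoint_write(pool, write_page):
    """Write only flagged buffers; return the page ids that were written."""
    written = []
    for buf in pool:
        if buf.checkpoint_needed:
            write_page(buf.page_id)
            buf.dirty = False
            buf.checkpoint_needed = False
            written.append(buf.page_id)
    return written


pool = [Buffer(0), Buffer(1), Buffer(2)]
dirty_page(pool[0])
dirty_page(pool[1])
begin_checkpoint(pool)
dirty_page(pool[2])  # dirtied after the checkpoint started
flushed = checkpoint_write(pool, write_page=lambda page_id: None)
```

In this run only pages 0 and 1 are flushed; page 2 stays dirty and is left for the next checkpoint (or for whatever else writes it out first).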


Here are the results of some tests I ran with LDC v4.0:

http://community.enterprisedb.com/ldc/

Imola-164 is a baseline run with CVS HEAD, with bgwriter_all_maxpages and bgwriter_all_percent set to zero. I've disabled think times in the test to make the checkpoint problem more severe. Imola-162 is the same test with the LDC patch applied. In Imola-163, bgwriter_all_maxpages was set to 10. These runs show that the patch clearly works; the response times during a checkpoint are much better. Imola-163 is even better, which demonstrates that using WRITES_PER_ABSORB (1000) in the absence of bgwriter_all_maxpages isn't a good idea.

Imola-165 is the same as Imola-163, but with the optimization I mentioned above applied. Only those dirty pages are written that are necessary for a coherent checkpoint. The results look roughly the same, except that Imola-165 achieves a slightly higher total TPM, and the pits in the TPM graph are slightly shallower.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
