Here is an updated version of the LDC patch (V4.1). In this release, checkpoints finish quickly if there are only a few dirty pages in the buffer pool, following the suggestion from Heikki. Thanks.

If the last write phase finished more quickly than configured, the next nap phase is shortened at the same rate. For example, if we set checkpoint_write_percent = 50% and the write phase actually finished in 25% of the checkpoint time, the nap duration is adjusted to checkpoint_nap_percent * 25% / 50%.

In the sync phase, we cut down the duration if there are only a few files to fsync. We assume storage whose throughput is at least 10 * bgwriter_all_maxpages pages per bgwriter_delay (this is arguable). For example, when bgwriter_delay=200ms and bgwriter_all_maxpages=5, we assume that we can use 2MB/s of flush throughput (10 * 5 pages * 8kB / 200ms). If there are 200MB of files to fsync, the duration of the sync phase is cut down to 100sec whenever that is shorter than checkpoint_sync_percent * checkpoint_timeout.

I use bgwriter_all_maxpages as something like a 'reserved band of storage for bgwriter' here. If there is a better name for it, please rename it.

Heikki Linnakangas <[EMAIL PROTECTED]> wrote:

> I guess we're fine if we do just avoid excessive waiting per the
> discussion in the next paragraph, and use a reasonable safety margin in
> the default values.
>
> >> Should we try doing something similar for the sync phase? If there's
> >> only 2 small files to fsync, there's no point sleeping for 5 minutes
> >> between them just to use up the checkpoint_sync_percent budget.
> >
> > Hmmm... if we add a new parameter like kernel_write_throughput [kB/s] and
> > clamp the maximum sleeping to size-of-segment / kernel_write_throughput (*1),
> > we can avoid unnecessary sleeping in fsync phase. Do we want to have such
> > a new parameter? I think we have many and many guc variables even now.
>
> How about using the same parameter that controls the minimum write speed
> of the write-phase (the patch used bgwriter_all_maxpages, but I
> suggested renaming it)?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
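
P.S. The arithmetic described above can be sketched roughly as follows. This is only an illustration in Python, not the code in the patch (which is C). The function names are hypothetical, the 20% nap and sync values are made-up inputs, and 1MB is treated as 1000kB so that the numbers match the round figures in the examples.

```python
# Illustrative sketch only; names and default values are assumptions,
# not taken from the patch itself.

def scaled_nap_percent(nap_percent, write_percent, actual_write_percent):
    """Shorten the nap phase at the same rate the write phase finished early."""
    # Never lengthen the nap if the write phase overran its budget.
    rate = min(1.0, actual_write_percent / write_percent)
    return nap_percent * rate


def assumed_flush_kb_per_sec(bgwriter_delay_ms, bgwriter_all_maxpages,
                             page_kb=8, factor=10):
    """Assumed storage throughput: factor * maxpages pages per bgwriter_delay."""
    return factor * bgwriter_all_maxpages * page_kb * 1000 / bgwriter_delay_ms


def sync_phase_seconds(kb_to_fsync, sync_percent, checkpoint_timeout_s,
                       bgwriter_delay_ms=200, bgwriter_all_maxpages=5):
    """Cap the sync phase at the time the assumed throughput needs to flush."""
    budget = sync_percent / 100.0 * checkpoint_timeout_s
    needed = kb_to_fsync / assumed_flush_kb_per_sec(bgwriter_delay_ms,
                                                    bgwriter_all_maxpages)
    return min(budget, needed)


if __name__ == "__main__":
    # checkpoint_write_percent = 50, write phase actually took 25
    print(scaled_nap_percent(20, 50, 25))
    # bgwriter_delay = 200ms, bgwriter_all_maxpages = 5
    print(assumed_flush_kb_per_sec(200, 5))
    # 200MB to fsync, with a sync budget of 20% of a 900s timeout
    print(sync_phase_seconds(200 * 1000, 20, 900))
```

With a shorter checkpoint_timeout (say 300s), the 20% budget of 60sec is already below the 100sec estimate, so the sync phase is not changed in that case.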