Re: [HACKERS] Redesigning checkpoint_segments

Heikki Linnakangas Wed, 05 Jun 2013 11:36:53 -0700

On 05.06.2013 21:16, Fujii Masao wrote:

On Wed, Jun 5, 2013 at 9:16 PM, Heikki Linnakangas
<[email protected]>  wrote:

I propose that we do something similar, but not exactly the same. Let's have
a setting, max_wal_size, to control the max. disk space reserved for WAL.
Once that's reached (or you get close enough, so that there are still some
segments left to consume while the checkpoint runs), a checkpoint is
triggered.


What if max_wal_size is reached while the checkpoint is running? We should
change the checkpoint from spread mode to fast mode?

The checkpoint spreading code already tracks if the checkpoint is "on
schedule", and it takes into account both checkpoint_timeout and
checkpoint_segments. I.e. if you consume segments faster than expected,
the checkpoint will speed up as well. Once checkpoint_segments is
reached, the checkpoint will complete ASAP, with no delays to spread it out.

This would still work the same with max_wal_size. A new checkpoint would
be started well before reaching max_wal_size, so that it has enough time
to complete. If the checkpoint "falls behind", it will hurry up until
it's back on schedule. If max_wal_size is reached anyway, it will
complete ASAP.
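
For illustration only, the "on schedule" logic described above could be
sketched roughly like this. It is not the actual checkpointer code: the
function names, the MB-based WAL budget and the fixed nap length are
assumptions made for the example, and the real code has more inputs
than this.

```python
def is_on_schedule(progress, elapsed_s, timeout_s, wal_used_mb, wal_budget_mb):
    """True if the checkpoint is at or ahead of both the time-based and
    the WAL-based schedule. `progress` is the fraction of dirty buffers
    already written (0.0 to 1.0). All names here are hypothetical."""
    time_fraction = elapsed_s / timeout_s
    wal_fraction = wal_used_mb / wal_budget_mb
    # Whichever schedule is further along sets the pace.
    return progress >= max(time_fraction, wal_fraction)


def write_delay_s(progress, elapsed_s, timeout_s, wal_used_mb, wal_budget_mb,
                  nap_s=0.1):
    """Delay to insert between buffer writes (assumed fixed nap length)."""
    if wal_used_mb >= wal_budget_mb:
        # The WAL budget is already used up: complete ASAP, no spreading.
        return 0.0
    if is_on_schedule(progress, elapsed_s, timeout_s, wal_used_mb, wal_budget_mb):
        # On or ahead of schedule: we can afford to nap.
        return nap_s
    # Fallen behind: hurry up until back on schedule.
    return 0.0
```

With this sketch, a checkpoint that is 50% done after 10% of both the
timeout and the WAL budget keeps napping, while one that is 20% done
after 50% of the WAL budget stops napping until it catches up.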

Or, if max_wal_size
is hard limit, we should keep the allocation of new WAL file waiting until
the checkpoint has finished and removed some old WAL files?

I was not thinking of making it a hard limit. It would be just like
checkpoint_segments from that point of view - if a checkpoint takes a
long time, max_wal_size might still be exceeded.

In this proposal, the number of segments preallocated is controlled
separately from max_wal_size, so that you can set max_wal_size high, without
actually consuming that much space in normal operation. It's just a
backstop, to avoid completely filling the disk, if there's a sudden burst of
activity. The number of segments preallocated is auto-tuned, based on the
number of segments used in previous checkpoint cycles.
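
The proposal does not spell out how the auto-tuning would work. One
possible approach, shown purely as an assumption, is a moving average
that jumps up immediately after a busy cycle and decays slowly after
quiet ones. The function name and the decay factor are made up for this
sketch.

```python
def update_prealloc_estimate(previous_estimate, segments_used, decay=0.9):
    """Return a new estimate of how many WAL segments to keep preallocated.

    previous_estimate: estimate carried over from earlier checkpoint cycles
    segments_used:     segments consumed in the cycle that just finished
    decay:             assumed smoothing factor, not taken from the proposal
    """
    if segments_used >= previous_estimate:
        # A burst of activity: raise the estimate right away.
        return float(segments_used)
    # A quieter cycle: shrink the estimate gradually.
    return decay * previous_estimate + (1.0 - decay) * segments_used
```

The asymmetry is a design choice for the sketch: preallocating too few
segments hurts during a burst, while holding a few extra segments for a
while is cheap.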


How is wal_keep_segments handled in your approach?

Hmm, haven't thought about that. I think a better unit to set
wal_keep_segments in would also be MB, not segments. Perhaps
max_wal_size should include WAL retained for wal_keep_segments, leaving
less room for checkpoints. I.e. when you set wal_keep_segments
higher, an xlog-based checkpoint would be triggered earlier, because the
old segments kept for replication would leave less room for new
segments. And setting wal_keep_segments higher than max_wal_size would
be an error.
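
As a rough sketch of that idea, with both settings expressed in MB
(the function and parameter names are hypothetical, nothing like this
exists yet):

```python
def room_for_new_wal_mb(max_wal_size_mb, wal_keep_mb):
    """WAL space left for new segments once the WAL retained for
    replication is counted against max_wal_size. Both values in MB."""
    if wal_keep_mb > max_wal_size_mb:
        # Retaining more WAL than the overall budget makes no sense.
        raise ValueError("wal_keep size must not exceed max_wal_size")
    return max_wal_size_mb - wal_keep_mb
```

E.g. with max_wal_size at 1024 MB and 256 MB retained for replication,
only 768 MB would be left for new WAL, so an xlog-based checkpoint would
be triggered correspondingly earlier.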


- Heikki


--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
