On 05.06.2013 21:16, Fujii Masao wrote:
On Wed, Jun 5, 2013 at 9:16 PM, Heikki Linnakangas
<[email protected]>  wrote:
I propose that we do something similar, but not exactly the same. Let's have
a setting, max_wal_size, to control the max. disk space reserved for WAL.
Once that's reached (or you get close enough, so that there are still some
segments left to consume while the checkpoint runs), a checkpoint is
triggered.

What if max_wal_size is reached while the checkpoint is running? Should we
change the checkpoint from spread mode to fast mode?

The checkpoint spreading code already tracks whether the checkpoint is "on schedule", and it takes into account both checkpoint_timeout and checkpoint_segments. I.e., if you consume segments faster than expected, the checkpoint will speed up as well. Once checkpoint_segments is reached, the checkpoint will complete ASAP, with no delays to spread it out.

This would still work the same with max_wal_size. A new checkpoint would be started well before reaching max_wal_size, so that it has enough time to complete. If the checkpoint "falls behind", it will hurry up until it's back on schedule. If max_wal_size is reached anyway, it will complete ASAP.
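To illustrate the "on schedule" idea, here is a rough sketch in Python. It is not the actual server code; the function and parameter names are made up for this example, and wal_budget_mb is assumed to be the amount of WAL the checkpoint is allowed to see consumed before max_wal_size is reached.

```python
def is_on_schedule(progress, wal_used_mb, wal_budget_mb,
                   elapsed_s, checkpoint_timeout_s):
    """Return True if the checkpoint is keeping up and may pause between writes.

    progress is the fraction of the checkpoint's work done so far (0.0 to 1.0).
    All names here are hypothetical, chosen for this illustration only.
    """
    # Fraction of the WAL budget consumed since the checkpoint started.
    wal_fraction = wal_used_mb / wal_budget_mb
    # Fraction of the timeout interval that has elapsed.
    time_fraction = elapsed_s / checkpoint_timeout_s
    # The checkpoint must be ahead of both clocks; otherwise it skips its
    # delays until it has caught up.
    return progress >= wal_fraction and progress >= time_fraction
```

Once the WAL budget is used up, wal_fraction is at least 1.0, so an unfinished checkpoint is never "on schedule" and runs with no delays.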

Or, if max_wal_size is a hard limit, should we keep the allocation of new
WAL files waiting until the checkpoint has finished and removed some old
WAL files?

I was not thinking of making it a hard limit. It would be just like checkpoint_segments from that point of view - if a checkpoint takes a long time, max_wal_size might still be exceeded.

In this proposal, the number of segments preallocated is controlled
separately from max_wal_size, so that you can set max_wal_size high, without
actually consuming that much space in normal operation. It's just a
backstop, to avoid completely filling the disk, if there's a sudden burst of
activity. The number of segments preallocated is auto-tuned, based on the
number of segments used in previous checkpoint cycles.
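One possible way to auto-tune the preallocation is sketched below in Python. The proposal above only says the number is based on previous checkpoint cycles; the moving average, the decay factor, and all names here are assumptions for illustration.

```python
import math


def update_estimate(estimate, segments_used, decay=0.9):
    """Update the running estimate of segments needed per checkpoint cycle.

    Assumed policy: jump up immediately on a busier cycle, decay slowly
    on quieter ones, so a single idle cycle does not shrink the pool much.
    """
    if segments_used > estimate:
        return float(segments_used)
    return decay * estimate + (1.0 - decay) * segments_used


def segments_to_preallocate(estimate, max_wal_segments):
    """Round the estimate up, but never preallocate past the max_wal_size cap."""
    return min(math.ceil(estimate), max_wal_segments)
```

With this policy a burst raises the preallocation at once, while max_wal_size (expressed here in segments) still acts as the upper bound.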

How is wal_keep_segments handled in your approach?

Hmm, haven't thought about that. I think a better unit to set wal_keep_segments in would also be MB, not segments. Perhaps max_wal_size should include WAL retained for wal_keep_segments, leaving less room for checkpoints. I.e., when you set wal_keep_segments higher, an xlog-based checkpoint would be triggered earlier, because the old segments kept for replication would leave less room for new segments. And setting wal_keep_segments higher than max_wal_size would be an error.
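As a rough sketch of that accounting (Python, hypothetical names, both settings assumed to be in MB as suggested above):

```python
def wal_room_for_checkpoints_mb(max_wal_size_mb, wal_keep_mb):
    """Return the WAL space left for new segments between checkpoints.

    WAL retained for replication counts against max_wal_size, so a larger
    wal_keep value leaves less room and triggers xlog-based checkpoints earlier.
    """
    if wal_keep_mb > max_wal_size_mb:
        # Keeping more WAL than the overall cap allows is rejected as a
        # configuration error.
        raise ValueError("wal_keep must not exceed max_wal_size")
    return max_wal_size_mb - wal_keep_mb
```

For example, with max_wal_size at 1024 MB and 256 MB kept for replication, only 768 MB would be available before a checkpoint has to be triggered.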

- Heikki


--
Sent via pgsql-hackers mailing list ([email protected])
