Re: [HACKERS] checkpoints after database start/immediate checkpoints
On Thu, Feb 4, 2016 at 4:42 PM, Andres Freund wrote: > On 2016-02-03 09:57:00 -0500, Robert Haas wrote: >> On Mon, Feb 1, 2016 at 7:43 PM, Andres Freund wrote: >> > I wonder if this essentially point at checkpoint_timeout being wrongly >> > defined: Currently it means we'll try to finish a checkpoint >> > (1-checkpoint_completion_target) * timeout before the next one - but >> > perhaps it should instead be that we start checkpoint_timeout * _target >> > before the next timeout? Afaics that'd work more graceful in the face of >> > restarts and forced checkpoints. >> >> There's a certain appeal to that, but at the same time it seems pretty >> wonky. Right now, you can say that a checkpoint is triggered when the >> amount of WAL reaches X or the amount of time reaches Y, but with the >> alternative definition it's a bit harder to explain what's going on >> there. > > Hm, but can you, really? We *start* a checkpoint every > checkpoint_timeout, but we only finish it after > checkpoint_completion_target * timeout, or cct * segments. I find it > pretty hard to explain that we have a gap of checkpoint_timeout, where > nothing happens, after an immediate/manual checkpoint. > > Defining it as: We try to *finish* a checkpoint every checkpoint_timeout > or checkpoint_segments/(max_wal_size/~3) actually seems simpler to > me. Then we just need to add that we start a checkpoint > checkpoint_completion_target before either of the above are reached. Hmm, I could go for that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] checkpoints after database start/immediate checkpoints
On 2016-02-03 09:57:00 -0500, Robert Haas wrote: > On Mon, Feb 1, 2016 at 7:43 PM, Andres Freund wrote: > > I wonder if this essentially point at checkpoint_timeout being wrongly > > defined: Currently it means we'll try to finish a checkpoint > > (1-checkpoint_completion_target) * timeout before the next one - but > > perhaps it should instead be that we start checkpoint_timeout * _target > > before the next timeout? Afaics that'd work more graceful in the face of > > restarts and forced checkpoints. > > There's a certain appeal to that, but at the same time it seems pretty > wonky. Right now, you can say that a checkpoint is triggered when the > amount of WAL reaches X or the amount of time reaches Y, but with the > alternative definition it's a bit harder to explain what's going on > there. Hm, but can you, really? We *start* a checkpoint every checkpoint_timeout, but we only finish it after checkpoint_completion_target * timeout, or cct * segments. I find it pretty hard to explain that we have a gap of checkpoint_timeout, where nothing happens, after an immediate/manual checkpoint. Defining it as: We try to *finish* a checkpoint every checkpoint_timeout or checkpoint_segments/(max_wal_size/~3) actually seems simpler to me. Then we just need to add that we start a checkpoint checkpoint_completion_target before either of the above are reached. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] checkpoints after database start/immediate checkpoints
On Mon, Feb 1, 2016 at 7:43 PM, Andres Freund wrote: > Right now it takes checkpoint_timeout till we start a checkpoint, and > checkpoint_timeout + checkpoint_timeout * checkpoint_completion_target > till we complete the first checkpoint after shutdown/forced checkpoints. > > That means a) that such checkpoint will often be bigger/more heavyweight > than the following ones, not what you want after a restart/create > database/basebackup... b) reliable benchmarks measuring steady state > have to run for even longer. > > There's some logic to that behaviour though: With a target of 0 it'd be > weird to start a checkpoint directly after startup, and even otherwise > there'll be less to write at the beginning. > > As an example: > 2016-02-02 01:32:58.053 CET 21361 LOG: checkpoint complete: wrote 186041 > buffers (71.0%); 1 transaction log file(s) added, 0 removed, 0 recycled; > sort=0.071 s, write=9.504 s, sync=0.000 s, total=9.532 s; sync files=0, > longest=0.000 s, average=0.000 s; distance=1460260 kB, estimate=1460260 kB > 2016-02-02 01:32:58.053 CET 21361 LOG: checkpoint starting: time > 2016-02-02 01:33:08.061 CET 21361 LOG: checkpoint complete: wrote 127249 > buffers (48.5%); 0 transaction log file(s) added, 0 removed, 89 recycled; > sort=0.045 s, write=9.987 s, sync=0.000 s, total=10.007 s; sync files=0, > longest=0.000 s, average=0.000 s; distance=1187558 kB, estimate=1432989 kB > 2016-02-02 01:33:58.216 CET 21361 LOG: checkpoint complete: wrote 124649 > buffers (47.5%); 0 transaction log file(s) added, 0 removed, 69 recycled; > sort=0.048 s, write=10.160 s, sync=0.000 s, total=10.176 s; sync files=0, > longest=0.000 s, average=0.000 s; distance=1135086 kB, estimate=1337776 kB > 2016-02-02 01:33:58.216 CET 21361 LOG: checkpoint starting: time > 2016-02-02 01:34:08.060 CET 21361 LOG: checkpoint complete: wrote 123298 > buffers (47.0%); 0 transaction log file(s) added, 0 removed, 69 recycled; > sort=0.049 s, write=9.838 s, sync=0.000 s, total=9.843 s; sync files=0, > longest=0.000 s, average=0.000 s; distance=1184077 kB, estimate=1322406 kB > 2016-02-02 01:34:08.060 CET 21361 LOG: checkpoint starting: time > 2016-02-02 01:34:18.052 CET 21361 LOG: checkpoint complete: wrote 118165 > buffers (45.1%); 0 transaction log file(s) added, 0 removed, 72 recycled; > sort=0.046 s, write=9.987 s, sync=0.000 s, total=9.992 s; sync files=0, > longest=0.000 s, average=0.000 s; distance=1198018 kB, estimate=1309967 kB > 2016-02-02 01:34:18.053 CET 21361 LOG: checkpoint starting: time > 2016-02-02 01:34:28.089 CET 21361 LOG: checkpoint complete: wrote 120814 > buffers (46.1%); 0 transaction log file(s) added, 0 removed, 73 recycled; > sort=0.051 s, write=10.020 s, sync=0.000 s, total=10.036 s; sync files=0, > longest=0.000 s, average=0.000 s; distance=1203691 kB, estimate=1299339 kB > 2016-02-02 01:34:28.090 CET 21361 LOG: checkpoint starting: time > 2016-02-02 01:34:39.182 CET 21361 LOG: checkpoint complete: wrote 110411 > buffers (42.1%); 0 transaction log file(s) added, 0 removed, 74 recycled; > sort=0.047 s, write=9.908 s, sync=0.000 s, total=11.092 s; sync files=0, > longest=0.000 s, average=0.000 s; distance=1073612 kB, estimate=1276767 kB > > We wrote roughly 1/3 more wal/buffers during the first checkpoint than > following ones (And yes, the above is with an impossibly low timeout, > but that doesn't change anything). You can make that much more extreme > with larget shared buffers and larger timeouts. > > > I wonder if this essentially point at checkpoint_timeout being wrongly > defined: Currently it means we'll try to finish a checkpoint > (1-checkpoint_completion_target) * timeout before the next one - but > perhaps it should instead be that we start checkpoint_timeout * _target > before the next timeout? Afaics that'd work more graceful in the face of > restarts and forced checkpoints. There's a certain appeal to that, but at the same time it seems pretty wonky. Right now, you can say that a checkpoint is triggered when the amount of WAL reaches X or the amount of time reaches Y, but with the alternative definition it's a bit harder to explain what's going on there. Maybe we should do it anyway, but... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] checkpoints after database start/immediate checkpoints
Hi, Right now it takes checkpoint_timeout till we start a checkpoint, and checkpoint_timeout + checkpoint_timeout * checkpoint_completion_target till we complete the first checkpoint after shutdown/forced checkpoints. That means a) that such checkpoint will often be bigger/more heavyweight than the following ones, not what you want after a restart/create database/basebackup... b) reliable benchmarks measuring steady state have to run for even longer. There's some logic to that behaviour though: With a target of 0 it'd be weird to start a checkpoint directly after startup, and even otherwise there'll be less to write at the beginning. As an example: 2016-02-02 01:32:58.053 CET 21361 LOG: checkpoint complete: wrote 186041 buffers (71.0%); 1 transaction log file(s) added, 0 removed, 0 recycled; sort=0.071 s, write=9.504 s, sync=0.000 s, total=9.532 s; sync files=0, longest=0.000 s, average=0.000 s; distance=1460260 kB, estimate=1460260 kB 2016-02-02 01:32:58.053 CET 21361 LOG: checkpoint starting: time 2016-02-02 01:33:08.061 CET 21361 LOG: checkpoint complete: wrote 127249 buffers (48.5%); 0 transaction log file(s) added, 0 removed, 89 recycled; sort=0.045 s, write=9.987 s, sync=0.000 s, total=10.007 s; sync files=0, longest=0.000 s, average=0.000 s; distance=1187558 kB, estimate=1432989 kB 2016-02-02 01:33:58.216 CET 21361 LOG: checkpoint complete: wrote 124649 buffers (47.5%); 0 transaction log file(s) added, 0 removed, 69 recycled; sort=0.048 s, write=10.160 s, sync=0.000 s, total=10.176 s; sync files=0, longest=0.000 s, average=0.000 s; distance=1135086 kB, estimate=1337776 kB 2016-02-02 01:33:58.216 CET 21361 LOG: checkpoint starting: time 2016-02-02 01:34:08.060 CET 21361 LOG: checkpoint complete: wrote 123298 buffers (47.0%); 0 transaction log file(s) added, 0 removed, 69 recycled; sort=0.049 s, write=9.838 s, sync=0.000 s, total=9.843 s; sync files=0, longest=0.000 s, average=0.000 s; distance=1184077 kB, estimate=1322406 kB 2016-02-02 01:34:08.060 CET 21361 LOG: checkpoint starting: time 2016-02-02 01:34:18.052 CET 21361 LOG: checkpoint complete: wrote 118165 buffers (45.1%); 0 transaction log file(s) added, 0 removed, 72 recycled; sort=0.046 s, write=9.987 s, sync=0.000 s, total=9.992 s; sync files=0, longest=0.000 s, average=0.000 s; distance=1198018 kB, estimate=1309967 kB 2016-02-02 01:34:18.053 CET 21361 LOG: checkpoint starting: time 2016-02-02 01:34:28.089 CET 21361 LOG: checkpoint complete: wrote 120814 buffers (46.1%); 0 transaction log file(s) added, 0 removed, 73 recycled; sort=0.051 s, write=10.020 s, sync=0.000 s, total=10.036 s; sync files=0, longest=0.000 s, average=0.000 s; distance=1203691 kB, estimate=1299339 kB 2016-02-02 01:34:28.090 CET 21361 LOG: checkpoint starting: time 2016-02-02 01:34:39.182 CET 21361 LOG: checkpoint complete: wrote 110411 buffers (42.1%); 0 transaction log file(s) added, 0 removed, 74 recycled; sort=0.047 s, write=9.908 s, sync=0.000 s, total=11.092 s; sync files=0, longest=0.000 s, average=0.000 s; distance=1073612 kB, estimate=1276767 kB We wrote roughly 1/3 more wal/buffers during the first checkpoint than following ones (And yes, the above is with an impossibly low timeout, but that doesn't change anything). You can make that much more extreme with larget shared buffers and larger timeouts. I wonder if this essentially point at checkpoint_timeout being wrongly defined: Currently it means we'll try to finish a checkpoint (1-checkpoint_completion_target) * timeout before the next one - but perhaps it should instead be that we start checkpoint_timeout * _target before the next timeout? Afaics that'd work more graceful in the face of restarts and forced checkpoints. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers