On Tue, Oct 31, 2017 at 2:00 PM, Simon Riggs <si...@2ndquadrant.com> wrote: > On 30 October 2017 at 18:58, Simon Riggs <si...@2ndquadrant.com> wrote: >> On 30 October 2017 at 15:22, Simon Riggs <si...@2ndquadrant.com> wrote: >> >>>> You forgot to update this formula in xlog.c: >>>> distance = (2.0 + CheckPointCompletionTarget) * >>>> CheckPointDistanceEstimate; >>>> /* add 10% for good measure. */ >>>> distance *= 1.10; >>>> You can guess easily where the mistake is. >>> >>> Doh - fixed that before posting, so I must have sent the wrong patch. >>> >>> It's the 10%, right? ;-) >> >> So, there are 2 locations that mention 2.0 in xlog.c >> >> I had already fixed one, which is why I remembered editing it. >> >> The other one you mention has a detailed comment above it explaining >> why it should be 2.0 rather than 1.0, so I left it. >> >> Let me know if you still think it should be removed? I can see the >> argument both ways. >> Or maybe we need another patch to account for manual checkpoints. > > Revised patch to implement this

Here is the comment and the formula: * The reason this calculation is done from the prior checkpoint, not the * one that just finished, is that this behaves better if some checkpoint * cycles are abnormally short, like if you perform a manual checkpoint * right after a timed one. The manual checkpoint will make almost a full * cycle's worth of WAL segments available for recycling, because the * segments from the prior's prior, fully-sized checkpoint cycle are no * longer needed. However, the next checkpoint will make only few segments * available for recycling, the ones generated between the timed * checkpoint and the manual one right after that. If at the manual * checkpoint we only retained enough segments to get us to the next timed * one, and removed the rest, then at the next checkpoint we would not * have enough segments around for recycling, to get us to the checkpoint * after that. Basing the calculations on the distance from the prior redo * pointer largely fixes that problem. */ distance = (2.0 + CheckPointCompletionTarget) * CheckPointDistanceEstimate; While the mention about a manual checkpoint happening after a timed one will cause a full range of WAL segments to be recycled, it is not actually true that segments of the prior's prior checkpoint are not needed, because with your patch the segments of the prior checkpoint are getting recycled. So it seems to me that based on that the formula ought to use 1.0 instead of 2.0... -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers