Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2015-04-01 Thread Jehan-Guillaume de Rorthais
Hi,

As I'm writing a doc patch for 9.4 back to 9.0, I'll discuss this formula below,
as it is the last one accepted by most of you.

On Mon, 3 Nov 2014 12:39:26 -0800
Jeff Janes jeff.ja...@gmail.com wrote:

 It looked to me that the formula, when descending from a previously
 stressed state, would be:
 
 greatest(1 + checkpoint_completion_target) * checkpoint_segments,
 wal_keep_segments) + 1 +
 2 * checkpoint_segments + 1

It lacks an opening parenthesis. I guess the formula is:

  greatest (
    (1 + checkpoint_completion_target) * checkpoint_segments,
    wal_keep_segments
  )
  + 1 + 2 * checkpoint_segments + 1

 This assumes logs are filled evenly over a checkpoint cycle, which is
 probably not true because there is a spike in full page writes right after
 a checkpoint starts.
 
 But I didn't have a great deal of confidence in my analysis.

The only problem I have with this formula is that considering
checkpoint_completion_target ~ 1 and wal_keep_segments = 0, it becomes:

  4 * checkpoint_segments + 2

Which violates the well-known, observed, default one:

  3 * checkpoint_segments + 1
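
As a quick check of that arithmetic, here is a minimal Python sketch. The function names are mine and purely illustrative (nothing here comes from PostgreSQL itself); it simply evaluates the formula with the parenthesis restored and the limit stated in the documentation:

```python
def jeff_formula(checkpoint_segments, checkpoint_completion_target,
                 wal_keep_segments):
    """Jeff's formula, with the missing parenthesis restored."""
    retained = max((1 + checkpoint_completion_target) * checkpoint_segments,
                   wal_keep_segments)
    # "+ 1", then the recycling allowance "2 * checkpoint_segments + 1"
    return retained + 1 + 2 * checkpoint_segments + 1


def documented_limit(checkpoint_segments):
    """Recycling limit as stated in the documentation."""
    return 3 * checkpoint_segments + 1


n = 10  # arbitrary example value for checkpoint_segments
print(jeff_formula(n, 1.0, 0))  # evaluates 4 * n + 2
print(documented_limit(n))      # evaluates 3 * n + 1
```

With checkpoint_completion_target = 1 and wal_keep_segments = 0, the two results differ by checkpoint_segments + 1.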

A value above this formula means the system cannot cope with the number of
files to flush. The doc is right about that:

   If, due to a short-term peak of log output rate, there
   are more than 3 * <varname>checkpoint_segments</varname> + 1
   segment files, the unneeded segment files will be deleted

The formula is wrong in the doc when wal_keep_segments > 0

 The first line reflects the number of WAL that will be retained as-is,

I agree that these files MUST be retained: the set of checkpoint_segments WALs
being flushed, plus the checkpoint_completion_target * checkpoint_segments ones
written in the meantime.

 the second is the number that will be recycled for future use before starting 
 to delete them.

I disagree, because the WAL files being written actually consume recycled
WALs in the meantime.

Your formula assumes that written files are newly created and recycled ones are
never touched, which leads to this checkpoint_segments + 1 difference between
the formulas.

 My reading of the code is that wal_keep_segments is computed from the
 current end of WAL (i.e the checkpoint record), not from the checkpoint
 redo point.  If I distribute the part outside the 'greatest' into both
 branches of the 'greatest', I don't get the same answer as you do for
 either branch.

So the formula, using checkpoint_completion_target=1, should be:

  greatest (
    checkpoint_segments,
    wal_keep_segments
  )
  + 2 * checkpoint_segments + 1

Please find attached to this email a documentation patch for 9.4 using this
formula.
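
For illustration only, a small Python sketch of the formula proposed above (the function name is hypothetical). It assumes checkpoint_completion_target = 1, as stated, and shows how the result moves once wal_keep_segments exceeds checkpoint_segments:

```python
def proposed_limit(checkpoint_segments, wal_keep_segments):
    """greatest(checkpoint_segments, wal_keep_segments)
    + 2 * checkpoint_segments + 1, assuming checkpoint_completion_target = 1."""
    return (max(checkpoint_segments, wal_keep_segments)
            + 2 * checkpoint_segments + 1)


checkpoint_segments = 10  # arbitrary example value
for wal_keep_segments in (0, 5, 50):
    print(wal_keep_segments,
          proposed_limit(checkpoint_segments, wal_keep_segments))
```

For wal_keep_segments <= checkpoint_segments this reduces to the familiar 3 * checkpoint_segments + 1.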

Regards,
-- 
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index d2392b2..1ed780b 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -546,17 +546,18 @@
 
   <para>
    There will always be at least one WAL segment file, and will normally
-   not be more than (2 + <varname>checkpoint_completion_target</varname>) * <varname>checkpoint_segments</varname> + 1
-   or <varname>checkpoint_segments</> + <xref linkend="guc-wal-keep-segments"> + 1
-   files.  Each segment file is normally 16 MB (though this size can be
+   not be more than greatest(<varname>checkpoint_segments</>, <xref linkend="guc-wal-keep-segments">)
+   + (1 + <varname>checkpoint_completion_target</varname>) * <varname>checkpoint_segments</varname>
+   + 1 files.  Each segment file is normally 16 MB (though this size can be
    altered when building the server).  You can use this to estimate space
    requirements for <acronym>WAL</acronym>.
    Ordinarily, when old log segment files are no longer needed, they
    are recycled (that is, renamed to become future segments in the numbered
    sequence). If, due to a short-term peak of log output rate, there
-   are more than 3 * <varname>checkpoint_segments</varname> + 1
-   segment files, the unneeded segment files will be deleted instead
-   of recycled until the system gets back under this limit.
+   are more than greatest(<varname>checkpoint_segments</>, <varname>wal_keep_segments</varname>)
+   + 2 * <varname>checkpoint_segments</varname> + 1 segment files, the
+   unneeded segment files will be deleted instead of recycled until the system
+   gets back under this limit.
   </para>
 
   <para>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2015-04-01 Thread Fujii Masao
On Wed, Apr 1, 2015 at 7:00 PM, Jehan-Guillaume de Rorthais
j...@dalibo.com wrote:
 Hi,

 As I'm writing a doc patch for 9.4 - 9.0, I'll discuss below on this formula
 as this is the last one accepted by most of you.

 On Mon, 3 Nov 2014 12:39:26 -0800
 Jeff Janes jeff.ja...@gmail.com wrote:

 It looked to me that the formula, when descending from a previously
 stressed state, would be:

 greatest(1 + checkpoint_completion_target) * checkpoint_segments,
 wal_keep_segments) + 1 +
 2 * checkpoint_segments + 1

 It lacks an opening parenthesis. I guess the formula is:

   greatest (
     (1 + checkpoint_completion_target) * checkpoint_segments,
     wal_keep_segments
   )
   + 1 + 2 * checkpoint_segments + 1

 This assumes logs are filled evenly over a checkpoint cycle, which is
 probably not true because there is a spike in full page writes right after
 a checkpoint starts.

 But I didn't have a great deal of confidence in my analysis.

 The only problem I have with this formula is that considering
 checkpoint_completion_target ~ 1 and wal_keep_segments = 0, it becomes:

   4 * checkpoint_segments + 2

 Which violates the well-known, observed, default one:

   3 * checkpoint_segments + 1

 A value above this formula means the system cannot cope with the number of
 files to flush. The doc is right about that:

If, due to a short-term peak of log output rate, there
are more than 3 * <varname>checkpoint_segments</varname> + 1
segment files, the unneeded segment files will be deleted

 The formula is wrong in the doc when wal_keep_segments > 0

 The first line reflects the number of WAL that will be retained as-is,

 I agree that these files MUST be retained: the set of checkpoint_segments WALs
 being flushed, plus the checkpoint_completion_target * checkpoint_segments ones
 written in the meantime.

 the second is the number that will be recycled for future use before starting
 to delete them.

 I disagree, because the WAL files being written actually consume recycled
 WALs in the meantime.

 Your formula assumes that written files are newly created and recycled ones are
 never touched, which leads to this checkpoint_segments + 1 difference between
 the formulas.

 My reading of the code is that wal_keep_segments is computed from the
 current end of WAL (i.e the checkpoint record), not from the checkpoint
 redo point.  If I distribute the part outside the 'greatest' into both
 branches of the 'greatest', I don't get the same answer as you do for
 either branch.

 So the formula, using checkpoint_completion_target=1, should be:

   greatest (
     checkpoint_segments,
     wal_keep_segments
   )
   + 2 * checkpoint_segments + 1

No. Please imagine how many WAL files can exist at the end of a checkpoint.
At the end of a checkpoint, we have to keep all the WAL files that were
generated since the starting point of the previous checkpoint, for future
crash recovery. The number of these WAL files is

(1 + checkpoint_completion_target) * checkpoint_segments

or

wal_keep_segments

In addition to these files, at the end of a checkpoint, old WAL files that were
generated before the starting point of the previous checkpoint are recycled.
The number of these WAL files is at most

2 * checkpoint_segments + 1

Note that *usually* there are not that many WAL files at the end of a
checkpoint. But this can happen after a peak of WAL logging generates
too many WAL files.

So the sum of those is the right formula, i.e.,

(3 + checkpoint_completion_target) * checkpoint_segments + 1

   or

wal_keep_segments + 2 * checkpoint_segments + 1

If checkpoint_completion_target is 1 and wal_keep_segments is 0,
it can become 4 * checkpoint_segments + 1.
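
To make the sum above concrete, here is a minimal Python sketch. The function names are hypothetical; the 16 MB figure is the default segment size, which can be changed when building the server. It adds the files that must be kept to the files that may be recycled, and converts the total to disk space:

```python
import math

SEGMENT_SIZE_MB = 16  # default WAL segment size; can differ per build


def max_wal_files(checkpoint_segments, checkpoint_completion_target,
                  wal_keep_segments):
    """Upper bound on WAL files at the end of a checkpoint, after a peak."""
    # Files needed for crash recovery, or kept because of wal_keep_segments.
    must_keep = max((1 + checkpoint_completion_target) * checkpoint_segments,
                    wal_keep_segments)
    # Old files that are recycled instead of deleted.
    recycled = 2 * checkpoint_segments + 1
    return math.ceil(must_keep + recycled)


def max_wal_space_mb(checkpoint_segments, checkpoint_completion_target,
                     wal_keep_segments):
    """Same bound expressed as disk space in megabytes."""
    return SEGMENT_SIZE_MB * max_wal_files(
        checkpoint_segments, checkpoint_completion_target, wal_keep_segments)


print(max_wal_files(10, 1.0, 0))     # the 4 * checkpoint_segments + 1 case
print(max_wal_space_mb(10, 1.0, 0))
```

The rounding up with math.ceil is my own choice for the sketch, since a fractional checkpoint_completion_target can yield a non-integer count.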

Regards,

-- 
Fujii Masao




Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2015-03-31 Thread Jehan-Guillaume de Rorthais
On Tue, 3 Mar 2015 11:15:13 -0500
Bruce Momjian br...@momjian.us wrote:

 On Tue, Oct 14, 2014 at 01:21:53PM -0400, Bruce Momjian wrote:
  On Tue, Oct 14, 2014 at 09:20:22AM -0700, Jeff Janes wrote:
   On Mon, Oct 13, 2014 at 12:11 PM, Bruce Momjian br...@momjian.us wrote:
   
   
   I looked into this, and came up with more questions.  Why is
   checkpoint_completion_target involved in the total number of WAL
   segments?  If checkpoint_completion_target is 0.5 (the default), the
   calculation is:
   
           (2 + 0.5) * checkpoint_segments + 1
   
   while if it is 0.9, it is:
   
           (2 + 0.9) * checkpoint_segments + 1
   
   Is this trying to estimate how many WAL files are going to be created
   during the checkpoint?  If so, wouldn't it be (1 +
   checkpoint_completion_target), not 2 +.  My logic is you have the
   old WAL files being checkpointed (that's the 1), plus you have new WAL
   files being created during the checkpoint, which would be
   checkpoint_completion_target * checkpoint_segments, plus one for the
   current WAL file.
   
   
   WAL is not eligible to be recycled until there have been 2 successful
   checkpoints.
   
   So at the end of a checkpoint, you have 1 cycle of WAL which has just
   become eligible for recycling,
   1 cycle of WAL which is now expendable but which is kept anyway, and
   checkpoint_completion_target worth of WAL which has occurred while the
   checkpoint was occurring and is still needed for crash recovery.
  
  OK, so based on this analysis, what is the right calculation?  This?
  
  (1 + checkpoint_completion_target) * checkpoint_segments + 1 +
  max(wal_keep_segments, checkpoint_segments)
 
 Now that we have min_wal_size and max_wal_size in 9.5, I don't see any
 value to figuring out the proper formula for backpatching.

I guess it worth backpatching the documentation as 9.4 - 9.1 will be supported
for somes the next 4 years

-- 
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com




Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2015-03-31 Thread Jehan-Guillaume de Rorthais
On Tue, 31 Mar 2015 08:24:15 +0200
Jehan-Guillaume de Rorthais j...@dalibo.com wrote:

 On Tue, 3 Mar 2015 11:15:13 -0500
 Bruce Momjian br...@momjian.us wrote:
 
  On Tue, Oct 14, 2014 at 01:21:53PM -0400, Bruce Momjian wrote:
   On Tue, Oct 14, 2014 at 09:20:22AM -0700, Jeff Janes wrote:
On Mon, Oct 13, 2014 at 12:11 PM, Bruce Momjian br...@momjian.us
wrote:


I looked into this, and came up with more questions.  Why is
checkpoint_completion_target involved in the total number of WAL
segments?  If checkpoint_completion_target is 0.5 (the default), the
calculation is:

        (2 + 0.5) * checkpoint_segments + 1

while if it is 0.9, it is:

        (2 + 0.9) * checkpoint_segments + 1

Is this trying to estimate how many WAL files are going to be
created during the checkpoint?  If so, wouldn't it be (1 +
checkpoint_completion_target), not 2 +.  My logic is you have the
old WAL files being checkpointed (that's the 1), plus you have new WAL
files being created during the checkpoint, which would be
checkpoint_completion_target * checkpoint_segments, plus one for the
current WAL file.


WAL is not eligible to be recycled until there have been 2 successful
checkpoints.

So at the end of a checkpoint, you have 1 cycle of WAL which has just
become eligible for recycling,
1 cycle of WAL which is now expendable but which is kept anyway, and
checkpoint_completion_target worth of WAL which has occurred while the
checkpoint was occurring and is still needed for crash recovery.
   
   OK, so based on this analysis, what is the right calculation?  This?
   
 (1 + checkpoint_completion_target) * checkpoint_segments + 1 +
 max(wal_keep_segments, checkpoint_segments)
  
  Now that we have min_wal_size and max_wal_size in 9.5, I don't see any
  value to figuring out the proper formula for backpatching.
 
 I guess it worth backpatching the documentation as 9.4 - 9.1 will be
 supported for somes the next 4 years

Sorry, lack of caffeine this morning. Fired the mail before correcting
and finishing it:

I guess it is worth backpatching the documentation, as 9.4 - 9.1 will be
supported for some more years.

I'll give it a try this week.

Regards,
-- 
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com




Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2015-03-03 Thread Bruce Momjian
On Tue, Oct 14, 2014 at 01:21:53PM -0400, Bruce Momjian wrote:
 On Tue, Oct 14, 2014 at 09:20:22AM -0700, Jeff Janes wrote:
  On Mon, Oct 13, 2014 at 12:11 PM, Bruce Momjian br...@momjian.us wrote:
  
  
  I looked into this, and came up with more questions.  Why is
  checkpoint_completion_target involved in the total number of WAL
  segments?  If checkpoint_completion_target is 0.5 (the default), the
  calculation is:
  
          (2 + 0.5) * checkpoint_segments + 1
  
  while if it is 0.9, it is:
  
          (2 + 0.9) * checkpoint_segments + 1
  
  Is this trying to estimate how many WAL files are going to be created
  during the checkpoint?  If so, wouldn't it be (1 +
  checkpoint_completion_target), not 2 +.  My logic is you have the old
  WAL files being checkpointed (that's the 1), plus you have new WAL
  files being created during the checkpoint, which would be
  checkpoint_completion_target * checkpoint_segments, plus one for the
  current WAL file.
  
  
  WAL is not eligible to be recycled until there have been 2 successful
  checkpoints.
  
  So at the end of a checkpoint, you have 1 cycle of WAL which has just become
  eligible for recycling,
  1 cycle of WAL which is now expendable but which is kept anyway, and
  checkpoint_completion_target worth of WAL which has occurred while the
  checkpoint was occurring and is still needed for crash recovery.
 
 OK, so based on this analysis, what is the right calculation?  This?
 
   (1 + checkpoint_completion_target) * checkpoint_segments + 1 +
   max(wal_keep_segments, checkpoint_segments)
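
As a rough numerical comparison, here is a small Python sketch (function names are illustrative) of the documented formula and the candidate quoted just above. It uses checkpoint_segments = 3 and checkpoint_completion_target = 0.5 as example values:

```python
def documented(checkpoint_segments, checkpoint_completion_target):
    """(2 + checkpoint_completion_target) * checkpoint_segments + 1"""
    return (2 + checkpoint_completion_target) * checkpoint_segments + 1


def candidate(checkpoint_segments, checkpoint_completion_target,
              wal_keep_segments):
    """(1 + checkpoint_completion_target) * checkpoint_segments + 1
    + max(wal_keep_segments, checkpoint_segments)"""
    return ((1 + checkpoint_completion_target) * checkpoint_segments + 1
            + max(wal_keep_segments, checkpoint_segments))


print(documented(3, 0.5))
print(candidate(3, 0.5, 0))    # same value as documented(3, 0.5)
print(candidate(3, 0.5, 10))   # grows once wal_keep_segments dominates
```

As long as wal_keep_segments does not exceed checkpoint_segments, the candidate is algebraically the same as the documented formula; the two only diverge for larger wal_keep_segments.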

Now that we have min_wal_size and max_wal_size in 9.5, I don't see any
value to figuring out the proper formula for backpatching.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-12-30 Thread Guillaume Lelarge
Sorry for my very late answer. It's been a tough month.

2014-11-27 0:00 GMT+01:00 Bruce Momjian br...@momjian.us:

 On Mon, Nov  3, 2014 at 12:39:26PM -0800, Jeff Janes wrote:
  It looked to me that the formula, when descending from a previously
 stressed
  state, would be:
 
  greatest(1 + checkpoint_completion_target) * checkpoint_segments,
  wal_keep_segments) + 1 +
  2 * checkpoint_segments + 1

 I don't think we can assume checkpoint_completion_target is at all
 reliable enough to base a maximum calculation on, assuming anything
 above the maximum is cause of concern and something to inform the admins
 about.

 Assuming checkpoint_completion_target is 1 for maximum purposes, how
 about:

 max(2 * checkpoint_segments, wal_keep_segments) + 2 *
 checkpoint_segments + 2


That seems like something I could agree on. At least, it makes sense, and it works
for my customers. Although I'm wondering why + 2, and not + 1. It seems
Jeff and you agree on this, so I may have misunderstood something.


-- 
Guillaume.
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com


Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-12-30 Thread Jeff Janes
On Tue, Dec 30, 2014 at 12:35 AM, Guillaume Lelarge guilla...@lelarge.info
wrote:

 Sorry for my very late answer. It's been a tough month.

 2014-11-27 0:00 GMT+01:00 Bruce Momjian br...@momjian.us:

 On Mon, Nov  3, 2014 at 12:39:26PM -0800, Jeff Janes wrote:
  It looked to me that the formula, when descending from a previously
 stressed
  state, would be:
 
  greatest(1 + checkpoint_completion_target) * checkpoint_segments,
  wal_keep_segments) + 1 +
  2 * checkpoint_segments + 1

 I don't think we can assume checkpoint_completion_target is at all
 reliable enough to base a maximum calculation on, assuming anything
 above the maximum is cause of concern and something to inform the admins
 about.

 Assuming checkpoint_completion_target is 1 for maximum purposes, how
 about:

 max(2 * checkpoint_segments, wal_keep_segments) + 2 *
 checkpoint_segments + 2


 Seems something I could agree on. At least, it makes sense, and it works
 for my customers. Although I'm wondering why + 2, and not + 1. It seems
 Jeff and you agree on this, so I may have misunderstood something.


From hazy memory, one +1 comes from the currently active WAL file, which
exists but is counted neither towards wal_keep_segments nor towards
recycled files.  And the other +1 comes from the formula for how many
recycled files to retain, which explicitly has a +1 in it.
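
A minimal Python sketch of that breakdown, applied to Bruce's formula quoted above. The function and key names are made up for illustration, and the attribution of each +1 follows my hazy recollection rather than a fresh reading of the code:

```python
def upper_bound_breakdown(checkpoint_segments, wal_keep_segments):
    """Split max(2 * cs, wks) + 2 * cs + 2 into its assumed components."""
    parts = {
        # Segments retained as-is (checkpoint_completion_target taken as 1).
        "retained": max(2 * checkpoint_segments, wal_keep_segments),
        # First +1: the currently active WAL file.
        "active_segment": 1,
        # Second +1: part of the recycling allowance itself.
        "recycled": 2 * checkpoint_segments + 1,
    }
    return parts, sum(parts.values())


parts, total = upper_bound_breakdown(10, 0)
print(parts)
print(total)
```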

Cheers,

Jeff


Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-12-30 Thread Guillaume Lelarge
2014-12-30 18:45 GMT+01:00 Jeff Janes jeff.ja...@gmail.com:

 On Tue, Dec 30, 2014 at 12:35 AM, Guillaume Lelarge 
 guilla...@lelarge.info wrote:

 Sorry for my very late answer. It's been a tough month.

 2014-11-27 0:00 GMT+01:00 Bruce Momjian br...@momjian.us:

 On Mon, Nov  3, 2014 at 12:39:26PM -0800, Jeff Janes wrote:
  It looked to me that the formula, when descending from a previously
 stressed
  state, would be:
 
  greatest(1 + checkpoint_completion_target) * checkpoint_segments,
  wal_keep_segments) + 1 +
  2 * checkpoint_segments + 1

 I don't think we can assume checkpoint_completion_target is at all
 reliable enough to base a maximum calculation on, assuming anything
 above the maximum is cause of concern and something to inform the admins
 about.

 Assuming checkpoint_completion_target is 1 for maximum purposes, how
 about:

 max(2 * checkpoint_segments, wal_keep_segments) + 2 *
 checkpoint_segments + 2


 Seems something I could agree on. At least, it makes sense, and it works
 for my customers. Although I'm wondering why + 2, and not + 1. It seems
 Jeff and you agree on this, so I may have misunderstood something.


 From hazy memory, one +1 comes from the currently active WAL file, which
 exists but is not counted towards either wal_keep_segments nor towards
 recycled files.  And the other +1 comes from the formula for how many
 recycled files to retain, which explicitly has a +1 in it.



OK, that seems much better. Thanks, Jeff.



-- 
Guillaume.
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com


Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-11-26 Thread Bruce Momjian
On Mon, Nov  3, 2014 at 12:39:26PM -0800, Jeff Janes wrote:
 It looked to me that the formula, when descending from a previously stressed
 state, would be:
 
 greatest(1 + checkpoint_completion_target) * checkpoint_segments,
 wal_keep_segments) + 1 + 
 2 * checkpoint_segments + 1 

I don't think we can assume checkpoint_completion_target is at all
reliable enough to base a maximum calculation on, assuming anything
above the maximum is cause of concern and something to inform the admins
about.

Assuming checkpoint_completion_target is 1 for maximum purposes, how
about:

max(2 * checkpoint_segments, wal_keep_segments) + 2 * 
checkpoint_segments + 2

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-11-03 Thread Jeff Janes
On Wed, Oct 15, 2014 at 1:11 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Fri, Aug 8, 2014 at 12:08 AM, Guillaume Lelarge guilla...@lelarge.info
  wrote:

 Hi,

 As part of our monitoring work for our customers, we stumbled upon an
 issue with our customers' servers who have a wal_keep_segments setting
 higher than 0.

 We have a monitoring script that checks the number of WAL files in the
 pg_xlog directory, according to the setting of three parameters
 (checkpoint_completion_target, checkpoint_segments, and wal_keep_segments).
 We usually add a percentage to the usual formula:

 greatest(
   (2 + checkpoint_completion_target) * checkpoint_segments + 1,
   checkpoint_segments + wal_keep_segments + 1
 )


 I think the first bug is even having this formula in the documentation to
 start with, and in trying to use it.

 and will normally not be more than...

 This may be normal for a toy system.  I think that the normal state for
 any system worth monitoring is that it has had load spikes at some point in
 the past.

 So it is the next part of the doc, which describes how many segments it
 climbs back down to upon recovering from a spike, which is the important
 one.  And that doesn't mention wal_keep_segments at all, which surely
 cannot be correct.

 I will try to independently derive the correct formula from the code, as
 you did, without looking too much at your derivation  first, and see if we
 get the same answer.


It looked to me that the formula, when descending from a previously
stressed state, would be:

greatest(1 + checkpoint_completion_target) * checkpoint_segments,
wal_keep_segments) + 1 +
2 * checkpoint_segments + 1

This assumes logs are filled evenly over a checkpoint cycle, which is
probably not true because there is a spike in full page writes right after
a checkpoint starts.

But I didn't have a great deal of confidence in my analysis.

The first line reflects the number of WAL that will be retained as-is, the
second is the number that will be recycled for future use before starting
to delete them.

My reading of the code is that wal_keep_segments is computed from the
current end of WAL (i.e the checkpoint record), not from the checkpoint
redo point.  If I distribute the part outside the 'greatest' into both
branches of the 'greatest', I don't get the same answer as you do for
either branch.
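
To spell out what distributing into both branches gives, here is a small Python sketch under my reading of the formulas. The function names are hypothetical, and the formula is taken with its parentheses balanced:

```python
def descending_formula(cs, cct, wks):
    """greatest((1 + cct) * cs, wks) + 1 + 2 * cs + 1"""
    return max((1 + cct) * cs, wks) + 1 + 2 * cs + 1


def descending_formula_distributed(cs, cct, wks):
    """Same value, with the outer '+ 2 * cs + 2' moved into each branch."""
    return max((3 + cct) * cs + 2, wks + 2 * cs + 2)


def monitoring_formula(cs, cct, wks):
    """greatest((2 + cct) * cs + 1, cs + wks + 1), as quoted at the top."""
    return max((2 + cct) * cs + 1, cs + wks + 1)


cs, cct = 10, 0.5  # arbitrary example values
for wks in (0, 100):
    a = descending_formula(cs, cct, wks)
    b = descending_formula_distributed(cs, cct, wks)
    c = monitoring_formula(cs, cct, wks)
    print(wks, a, b, c, a - c)
```

In both branches the distributed form exceeds the corresponding branch of the monitoring formula by checkpoint_segments + 1.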

Then I started wondering if the number we keep for recycling is a good
choice, anyway.  2 * checkpoint_segments + 1 seems pretty large.  But then
again, given that we've reached the high-water-mark once, how unlikely are
we to reach it again?

Cheers,

Jeff


Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-11-02 Thread Guillaume Lelarge
Hi,

On 15 Oct 2014 22:25, Guillaume Lelarge guilla...@lelarge.info wrote:

 2014-10-15 22:11 GMT+02:00 Jeff Janes jeff.ja...@gmail.com:

 On Fri, Aug 8, 2014 at 12:08 AM, Guillaume Lelarge 
guilla...@lelarge.info wrote:

 Hi,

 As part of our monitoring work for our customers, we stumbled upon an
issue with our customers' servers who have a wal_keep_segments setting
higher than 0.

 We have a monitoring script that checks the number of WAL files in the
pg_xlog directory, according to the setting of three parameters
(checkpoint_completion_target, checkpoint_segments, and wal_keep_segments).
We usually add a percentage to the usual formula:

 greatest(
   (2 + checkpoint_completion_target) * checkpoint_segments + 1,
   checkpoint_segments + wal_keep_segments + 1
 )


 I think the first bug is even having this formula in the documentation
to start with, and in trying to use it.


 I agree. But we have customers asking how to compute the right size for
their WAL file system partitions. Right size is usually a euphemism for
smallest size, and they usually tend to get it wrong, leading to huge
issues. And I'm not even speaking of monitoring, and alerting.

 A way to avoid this issue is probably to erase the formula from the
documentation, and find a new way to explain to them how to size their
partitions for WALs.

 Monitoring is another matter, and I don't really think a monitoring
solution should count the WAL files. What actually really matters is the
database availability, and that is covered with having enough disk space in
the WALs partition.

 and will normally not be more than...

 This may be normal for a toy system.  I think that the normal state
for any system worth monitoring is that it has had load spikes at some
point in the past.


 Agreed.


 So it is the next part of the doc, which describes how many segments it
climbs back down to upon recovering from a spike, which is the important
one.  And that doesn't mention wal_keep_segments at all, which surely
cannot be correct.


 Agreed too.


 I will try to independently derive the correct formula from the code, as
you did, without looking too much at your derivation  first, and see if we
get the same answer.


 Thanks. I look forward to reading what you found.

 What seems clear to me right now is that no one has a sane explanation of
the formula. Though yours definitely made sense, it didn't seem to be what
the code does.


Did you find time to work on this? Any news?

Thanks.


Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-10-15 Thread Fujii Masao
On Fri, Aug 8, 2014 at 4:08 PM, Guillaume Lelarge
guilla...@lelarge.info wrote:
 Hi,

 As part of our monitoring work for our customers, we stumbled upon an issue
 with our customers' servers who have a wal_keep_segments setting higher than
 0.

 We have a monitoring script that checks the number of WAL files in the
 pg_xlog directory, according to the setting of three parameters
 (checkpoint_completion_target, checkpoint_segments, and wal_keep_segments).
 We usually add a percentage to the usual formula:

 greatest(
   (2 + checkpoint_completion_target) * checkpoint_segments + 1,
   checkpoint_segments + wal_keep_segments + 1
 )

 And we have lots of alerts from the script for customers who set their
 wal_keep_segments setting higher than 0.
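
For readers who want a picture of such a check, here is a minimal Python sketch. It is not the actual monitoring script: the function names and the 10% margin are assumptions made for illustration. It relies only on the fact that WAL segment files have 24-character hexadecimal names.

```python
import os
import re

# WAL segment files are named with 24 uppercase hexadecimal characters.
WAL_SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def count_wal_files(pg_xlog_dir):
    """Count WAL segments, ignoring .history, .backup, archive_status, etc."""
    return sum(1 for name in os.listdir(pg_xlog_dir)
               if WAL_SEGMENT_NAME.match(name))


def alert_threshold(checkpoint_segments, checkpoint_completion_target,
                    wal_keep_segments, margin=0.10):
    """Formula quoted above plus a safety margin (10% is an assumed value)."""
    expected = max(
        (2 + checkpoint_completion_target) * checkpoint_segments + 1,
        checkpoint_segments + wal_keep_segments + 1,
    )
    return expected * (1 + margin)


def too_many_wal_files(pg_xlog_dir, checkpoint_segments,
                       checkpoint_completion_target, wal_keep_segments):
    """Return True when the directory holds more files than the threshold."""
    threshold = alert_threshold(checkpoint_segments,
                                checkpoint_completion_target,
                                wal_keep_segments)
    return count_wal_files(pg_xlog_dir) > threshold
```

A caller would pass the path of the pg_xlog directory and the three settings read from the server configuration.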

 So we started to question this sentence of the documentation:

 There will always be at least one WAL segment file, and will normally not be
 more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 or
 checkpoint_segments + wal_keep_segments + 1 files.

 (http://www.postgresql.org/docs/9.3/static/wal-configuration.html)

 While doing some tests, it appears it would be something more like:

 wal_keep_segments + (2 + checkpoint_completion_target) * checkpoint_segments
 + 1

 But after reading the source code (src/backend/access/transam/xlog.c), the
 right formula seems to be:

 wal_keep_segments + 2 * checkpoint_segments + 1

 Here is how we arrived at this formula...

 CreateCheckPoint(..) is responsible, among other things, for deleting and
 recycling old WAL files. From src/backend/access/transam/xlog.c, master
 branch, line 8363:

 /*
  * Delete old log files (those no longer needed even for previous
  * checkpoint or the standbys in XLOG streaming).
  */
 if (_logSegNo)
 {
     KeepLogSeg(recptr, &_logSegNo);
     _logSegNo--;
     RemoveOldXlogFiles(_logSegNo, recptr);
 }

 KeepLogSeg(...) function takes care of wal_keep_segments. From
 src/backend/access/transam/xlog.c, master branch, line 8792:

 /* compute limit for wal_keep_segments first */
 if (wal_keep_segments > 0)
 {
     /* avoid underflow, don't go below 1 */
     if (segno <= wal_keep_segments)
         segno = 1;
     else
         segno = segno - wal_keep_segments;
 }

 IOW, the segment number (segno) is decremented according to the setting of
 wal_keep_segments. segno is then sent back to CreateCheckPoint(...) via
 _logSegNo. The RemoveOldXlogFiles() gets this segment number so that it can
 remove or recycle all files before this segment number. This function gets
 the number of WAL files to recycle with the XLOGfileslop constant, which is
 defined as:

 /*
  * XLOGfileslop is the maximum number of preallocated future XLOG segments.
  * When we are done with an old XLOG segment file, we will recycle it as a
  * future XLOG segment as long as there aren't already XLOGfileslop future
  * segments; else we'll delete it.  This could be made a separate GUC
  * variable, but at present I think it's sufficient to hardwire it as
  * 2*CheckPointSegments+1.  Under normal conditions, a checkpoint will free
  * no more than 2*CheckPointSegments log segments, and we want to recycle all
  * of them; the +1 allows boundary cases to happen without wasting a
  * delete/create-segment cycle.
  */
 #define XLOGfileslop    (2*CheckPointSegments + 1)

 (in src/backend/access/transam/xlog.c, master branch, line 100)

 IOW, PostgreSQL will keep wal_keep_segments WAL files before the current WAL
 file, and then there may be 2*CheckPointSegments + 1 recycled ones. Hence
 the formula:

 wal_keep_segments + 2 * checkpoint_segments + 1

 And this is what we usually find in our customers' servers. We may find more
 WAL files, depending on the write activity of the cluster, but on average,
 we get this number of WAL files.
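
The reasoning above can be sketched in a few lines of Python. This is only a paraphrase of the two quoted excerpts (the wal_keep_segments limit and XLOGfileslop), not the server code, and the function names are mine:

```python
def keep_limit(segno, wal_keep_segments):
    """Mirror of the quoted excerpt: step back wal_keep_segments segments,
    without going below segment 1."""
    if wal_keep_segments > 0:
        if segno <= wal_keep_segments:
            segno = 1
        else:
            segno = segno - wal_keep_segments
    return segno


def xlog_file_slop(checkpoint_segments):
    """Maximum number of recycled (preallocated future) segments."""
    return 2 * checkpoint_segments + 1


def estimated_wal_files(checkpoint_segments, wal_keep_segments):
    """wal_keep_segments + 2 * checkpoint_segments + 1"""
    return wal_keep_segments + xlog_file_slop(checkpoint_segments)


print(keep_limit(100, 32))          # oldest segment kept when at segment 100
print(estimated_wal_files(3, 32))
```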

 AFAICT, the documentation is wrong about the usual number of WAL files in
 the pg_xlog directory. But I may be wrong, in which case, the documentation
 isn't clear enough for me, and should be fixed so that others can't
 misinterpret it like I may have done.

 Any comments? did I miss something, or should we fix the documentation?

I think you're right. The correct formula of the number of WAL files in
pg_xlog seems to be

(3 + checkpoint_completion_target) * checkpoint_segments + 1

or

wal_keep_segments + 2 * checkpoint_segments + 1


Why? At the end of checkpoint, the WAL files which were generated since the
start of previous checkpoint cannot be removed and must remain in pg_xlog.
The number of them is

(1 + checkpoint_completion_target) * checkpoint_segments

or

wal_keep_segments

Also, at the end of checkpoint, as you pointed out, if there are enough old
WAL files, 2 * checkpoint_segments + 1 WAL files will be recycled. Then
checkpoint_segments WAL files will be consumed till the end of the next
checkpoint. But since there are already 2 * checkpoint_segments + 1 recycled
WAL files, no new files are created. So, WAL files that we cannot
remove and can recycle at the 

Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-10-15 Thread Jeff Janes
On Fri, Aug 8, 2014 at 12:08 AM, Guillaume Lelarge guilla...@lelarge.info
wrote:

 Hi,

 As part of our monitoring work for our customers, we stumbled upon an
 issue with our customers' servers who have a wal_keep_segments setting
 higher than 0.

 We have a monitoring script that checks the number of WAL files in the
 pg_xlog directory, according to the setting of three parameters
 (checkpoint_completion_target, checkpoint_segments, and wal_keep_segments).
 We usually add a percentage to the usual formula:

 greatest(
   (2 + checkpoint_completion_target) * checkpoint_segments + 1,
   checkpoint_segments + wal_keep_segments + 1
 )


I think the first bug is even having this formula in the documentation to
start with, and in trying to use it.

and will normally not be more than...

This may be normal for a toy system.  I think that the normal state for
any system worth monitoring is that it has had load spikes at some point in
the past.

So it is the next part of the doc, which describes how many segments it
climbs back down to upon recovering from a spike, which is the important
one.  And that doesn't mention wal_keep_segments at all, which surely
cannot be correct.

I will try to independently derive the correct formula from the code, as
you did, without looking too much at your derivation  first, and see if we
get the same answer.

Cheers,

Jeff


Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-10-15 Thread Guillaume Lelarge
2014-10-15 22:11 GMT+02:00 Jeff Janes jeff.ja...@gmail.com:

 On Fri, Aug 8, 2014 at 12:08 AM, Guillaume Lelarge guilla...@lelarge.info
  wrote:

 Hi,

 As part of our monitoring work for our customers, we stumbled upon an
 issue with our customers' servers who have a wal_keep_segments setting
 higher than 0.

 We have a monitoring script that checks the number of WAL files in the
 pg_xlog directory, according to the setting of three parameters
 (checkpoint_completion_target, checkpoint_segments, and wal_keep_segments).
 We usually add a percentage to the usual formula:

 greatest(
   (2 + checkpoint_completion_target) * checkpoint_segments + 1,
   checkpoint_segments + wal_keep_segments + 1
 )


 I think the first bug is even having this formula in the documentation to
 start with, and in trying to use it.


I agree. But we have customers asking how to compute the right size for
their WAL file system partitions. Right size is usually a euphemism for
smallest size, and they usually tend to get it wrong, leading to huge
issues. And I'm not even speaking of monitoring, and alerting.

A way to avoid this issue is probably to erase the formula from the
documentation, and find a new way to explain to them how to size their
partitions for WALs.

Monitoring is another matter, and I don't really think a monitoring
solution should count the WAL files. What actually really matters is the
database availability, and that is covered with having enough disk space in
the WALs partition.

and will normally not be more than...

 This may be normal for a toy system.  I think that the normal state for
 any system worth monitoring is that it has had load spikes at some point in
 the past.


Agreed.


 So it is the next part of the doc, which describes how many segments it
 climbs back down to upon recovering from a spike, which is the important
 one.  And that doesn't mention wal_keep_segments at all, which surely
 cannot be correct.


Agreed too.


 I will try to independently derive the correct formula from the code, as
 you did, without looking too much at your derivation  first, and see if we
 get the same answer.


Thanks. I look forward to reading what you find.

What seems clear to me right now is that no one has a sane explanation of
the formula. Though yours definitely made sense, it didn't seem to be what
the code does.


-- 
Guillaume.
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com


Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-10-15 Thread Josh Berkus
On 10/15/2014 01:25 PM, Guillaume Lelarge wrote:
 Monitoring is another matter, and I don't really think a monitoring
 solution should count the WAL files. What actually really matters is the
 database availability, and that is covered with having enough disk space in
 the WALs partition.

If we don't count the WAL files, though, that eliminates the best way to
detect when archiving is failing.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-10-14 Thread Jeff Janes
On Mon, Oct 13, 2014 at 12:11 PM, Bruce Momjian br...@momjian.us wrote:


 I looked into this, and came up with more questions.  Why is
 checkpoint_completion_target involved in the total number of WAL
 segments?  If checkpoint_completion_target is 0.5 (the default), the
 calculation is:

 (2 + 0.5) * checkpoint_segments + 1

 while if it is 0.9, it is:

 (2 + 0.9) * checkpoint_segments + 1

 Is this trying to estimate how many WAL files are going to be created
 during the checkpoint?  If so, wouldn't it be (1 +
 checkpoint_completion_target), not 2 +.  My logic is you have the old
 WAL files being checkpointed (that's the 1), plus you have new WAL
 files being created during the checkpoint, which would be
 checkpoint_completion_target * checkpoint_segments, plus one for the
 current WAL file.


WAL is not eligible to be recycled until there have been 2 successful
checkpoints.

So at the end of a checkpoint, you have 1 cycle of WAL which has just
become eligible for recycling,
1 cycle of WAL which is now expendable but which is kept anyway, and
checkpoint_completion_target worth of WAL which has occurred while the
checkpoint was occurring and is still needed for crash recovery.
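
The breakdown above can be written out as a small sketch. The struct, its field names, and the function names are illustrative only, and the final "+ 1" is taken from the documented formula rather than from this explanation.

```c
#include <assert.h>

/* Hypothetical breakdown of the WAL present at the end of a checkpoint. */
struct wal_breakdown
{
    double newly_recyclable;    /* one cycle that just became eligible for recycling */
    double kept_previous;       /* one cycle that is expendable but kept anyway */
    double during_checkpoint;   /* WAL written while the checkpoint was running */
    double current_segment;     /* the "+ 1" of the documented formula */
};

struct wal_breakdown
wal_breakdown_at_checkpoint_end(double checkpoint_completion_target,
                                int checkpoint_segments)
{
    struct wal_breakdown b;

    assert(checkpoint_completion_target >= 0.0);
    assert(checkpoint_completion_target <= 1.0);
    assert(checkpoint_segments > 0);

    b.newly_recyclable = checkpoint_segments;
    b.kept_previous = checkpoint_segments;
    b.during_checkpoint = checkpoint_completion_target * checkpoint_segments;
    b.current_segment = 1.0;
    return b;
}

/* Sums the parts: (2 + checkpoint_completion_target) * checkpoint_segments + 1 */
double
wal_breakdown_total(struct wal_breakdown b)
{
    return b.newly_recyclable + b.kept_previous +
           b.during_checkpoint + b.current_segment;
}
```

For checkpoint_completion_target = 0.5 and checkpoint_segments = 10 the parts are 10, 10, 5 and 1, which sum to the documented 26.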

I don't really understand the point of this way of doing things.  I guess
it is because the control file contains two redo pointers, one for the last
checkpoint and one for the checkpoint before that, and if recovery
finds that it can't use the most recent one it tries the previous one.
Why?  Beats me.  If we are worried about the control file getting a corrupt
redo pointer, it seems like we would record the last one twice, rather than
recording two different ones once each.  And if the in-memory version got
corrupted before being written to the file, I really doubt anything is
going to save your bacon at that point.

I've never seen a case where recovery couldn't use the last recorded good
checkpoint, so instead used the previous one, and was successful at it.
But then again I haven't seen all possible crashes.

This is based on memory from the last time I looked into this; I haven't
re-verified it, so it could be wrong or obsolete.

Cheers,

Jeff


Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-10-14 Thread Bruce Momjian
On Tue, Oct 14, 2014 at 09:20:22AM -0700, Jeff Janes wrote:
 On Mon, Oct 13, 2014 at 12:11 PM, Bruce Momjian br...@momjian.us wrote:
 
 
 I looked into this, and came up with more questions.  Why is
 checkpoint_completion_target involved in the total number of WAL
 segments?  If checkpoint_completion_target is 0.5 (the default), the
 calculation is:
 
         (2 + 0.5) * checkpoint_segments + 1
 
 while if it is 0.9, it is:
 
         (2 + 0.9) * checkpoint_segments + 1
 
 Is this trying to estimate how many WAL files are going to be created
 during the checkpoint?  If so, wouldn't it be (1 +
 checkpoint_completion_target), not 2 +.  My logic is you have the old
 WAL files being checkpointed (that's the 1), plus you have new WAL
 files being created during the checkpoint, which would be
 checkpoint_completion_target * checkpoint_segments, plus one for the
 current WAL file.
 
 
 WAL is not eligible to be recycled until there have been 2 successful
 checkpoints.
 
 So at the end of a checkpoint, you have 1 cycle of WAL which has just become
 eligible for recycling,
 1 cycle of WAL which is now expendable but which is kept anyway, and
 checkpoint_completion_target worth of WAL which has occurred while the
 checkpoint was occurring and is still needed for crash recovery.

OK, so based on this analysis, what is the right calculation?  This?

(1 + checkpoint_completion_target) * checkpoint_segments + 1 +
max(wal_keep_segments, checkpoint_segments)
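
A minimal sketch of the expression proposed above, for trying out values. The function name is hypothetical, and the expression itself is only a question at this point in the thread, not a confirmed result.

```c
#include <assert.h>

/*
 * Hypothetical helper for the proposed expression:
 * (1 + checkpoint_completion_target) * checkpoint_segments + 1
 *     + max(wal_keep_segments, checkpoint_segments)
 */
double
proposed_wal_files(double checkpoint_completion_target,
                   int checkpoint_segments, int wal_keep_segments)
{
    int larger;

    assert(checkpoint_segments > 0 && wal_keep_segments >= 0);

    larger = wal_keep_segments > checkpoint_segments
        ? wal_keep_segments
        : checkpoint_segments;

    return (1.0 + checkpoint_completion_target) * checkpoint_segments
        + 1.0 + larger;
}
```

When wal_keep_segments is not larger than checkpoint_segments, this reduces to the documented (2 + checkpoint_completion_target) * checkpoint_segments + 1; for example 0.5, 10 and 0 give 26, while 0.5, 10 and 100 give 116.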

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-10-13 Thread Bruce Momjian
On Mon, Aug 25, 2014 at 07:12:33AM +0200, Guillaume Lelarge wrote:
 On 8 August 2014 09:08, Guillaume Lelarge guilla...@lelarge.info wrote:
 
  Hi,
 
  As part of our monitoring work for our customers, we stumbled upon an issue
 with our customers' servers who have a wal_keep_segments setting higher than 
 0.
 
  We have a monitoring script that checks the number of WAL files in the
 pg_xlog directory, according to the setting of three parameters
 (checkpoint_completion_target, checkpoint_segments, and wal_keep_segments). We
 usually add a percentage to the usual formula:
 
  greatest(
    (2 + checkpoint_completion_target) * checkpoint_segments + 1,
    checkpoint_segments + wal_keep_segments + 1
  )
 
  And we have lots of alerts from the script for customers who set their
 wal_keep_segments setting higher than 0.
 
  So we started to question this sentence of the documentation:
 
  There will always be at least one WAL segment file, and will normally not be
 more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 or
 checkpoint_segments + wal_keep_segments + 1 files.
 
  (http://www.postgresql.org/docs/9.3/static/wal-configuration.html)
 
  While doing some tests, it appears it would be more something like:
 
  wal_keep_segments + (2 + checkpoint_completion_target) * checkpoint_segments
 + 1
 
  But after reading the source code (src/backend/access/transam/xlog.c), the
 right formula seems to be:
 
  wal_keep_segments + 2 * checkpoint_segments + 1
 
  Here is how we went to this formula...
 
  CreateCheckPoint(..) is responsible, among other things, for deleting and
 recycling old WAL files. From src/backend/access/transam/xlog.c, master 
 branch,
 line 8363:
 
  /*
   * Delete old log files (those no longer needed even for previous
   * checkpoint or the standbys in XLOG streaming).
   */
  if (_logSegNo)
  {
      KeepLogSeg(recptr, &_logSegNo);
      _logSegNo--;
      RemoveOldXlogFiles(_logSegNo, recptr);
  }
 
  KeepLogSeg(...) function takes care of wal_keep_segments. From src/backend/
 access/transam/xlog.c, master branch, line 8792:
 
  /* compute limit for wal_keep_segments first */
  if (wal_keep_segments > 0)
  {
      /* avoid underflow, don't go below 1 */
      if (segno <= wal_keep_segments)
          segno = 1;
      else
          segno = segno - wal_keep_segments;
  }
 
  IOW, the segment number (segno) is decremented according to the setting of
 wal_keep_segments. segno is then sent back to CreateCheckPoint(...) via
 _logSegNo. The RemoveOldXlogFiles() gets this segment number so that it can
 remove or recycle all files before this segment number. This function gets the
 number of WAL files to recycle with the XLOGfileslop constant, which is 
 defined
 as:
 
  /*
   * XLOGfileslop is the maximum number of preallocated future XLOG segments.
   * When we are done with an old XLOG segment file, we will recycle it as a
   * future XLOG segment as long as there aren't already XLOGfileslop future
   * segments; else we'll delete it.  This could be made a separate GUC
   * variable, but at present I think it's sufficient to hardwire it as
   * 2*CheckPointSegments+1.  Under normal conditions, a checkpoint will free
   * no more than 2*CheckPointSegments log segments, and we want to recycle all
   * of them; the +1 allows boundary cases to happen without wasting a
   * delete/create-segment cycle.
   */
  #define XLOGfileslop    (2*CheckPointSegments + 1)
 
  (in src/backend/access/transam/xlog.c, master branch, line 100)
 
  IOW, PostgreSQL will keep wal_keep_segments WAL files before the current WAL
 file, and then there may be 2*CheckPointSegments + 1 recycled ones. Hence the
 formula:
 
  wal_keep_segments + 2 * checkpoint_segments + 1
 
  And this is what we usually find in our customers' servers. We may find more
 WAL files, depending on the write activity of the cluster, but in average, we
 get this number of WAL files.
 
  AFAICT, the documentation is wrong about the usual number of WAL files in 
  the
 pg_xlog directory. But I may be wrong, in which case, the documentation isn't
 clear enough for me, and should be fixed so that others can't misinterpret it
 like I may have done.
 
  Any comments? did I miss something, or should we fix the documentation?

I looked into this, and came up with more questions.  Why is
checkpoint_completion_target involved in the total number of WAL
segments?  If checkpoint_completion_target is 0.5 (the default), the
calculation is:

(2 + 0.5) * checkpoint_segments + 1

while if it is 0.9, it is:

(2 + 0.9) * checkpoint_segments + 1

Is this trying to estimate how many WAL files are going to be created
during the checkpoint?  If so, wouldn't it be (1 +
checkpoint_completion_target), not 2 +.  My logic is you have the old
WAL files being checkpointed (that's the 1), plus you have new WAL
files being created during the checkpoint, which would be
checkpoint_completion_target * checkpoint_segments, plus one for the
current 

Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-08-24 Thread Guillaume Lelarge
On 8 August 2014 09:08, Guillaume Lelarge guilla...@lelarge.info wrote:

 Hi,

 As part of our monitoring work for our customers, we stumbled upon an
issue with our customers' servers who have a wal_keep_segments setting
higher than 0.

 We have a monitoring script that checks the number of WAL files in the
pg_xlog directory, according to the setting of three parameters
(checkpoint_completion_target, checkpoint_segments, and wal_keep_segments).
We usually add a percentage to the usual formula:

 greatest(
   (2 + checkpoint_completion_target) * checkpoint_segments + 1,
   checkpoint_segments + wal_keep_segments + 1
 )

 And we have lots of alerts from the script for customers who set their
wal_keep_segments setting higher than 0.

 So we started to question this sentence of the documentation:

 There will always be at least one WAL segment file, and will normally not
be more than (2 + checkpoint_completion_target) * checkpoint_segments + 1
or checkpoint_segments + wal_keep_segments + 1 files.

 (http://www.postgresql.org/docs/9.3/static/wal-configuration.html)

 While doing some tests, it appears it would be more something like:

 wal_keep_segments + (2 + checkpoint_completion_target) *
checkpoint_segments + 1

 But after reading the source code (src/backend/access/transam/xlog.c),
the right formula seems to be:

 wal_keep_segments + 2 * checkpoint_segments + 1

 Here is how we went to this formula...

 CreateCheckPoint(..) is responsible, among other things, for deleting and
recycling old WAL files. From src/backend/access/transam/xlog.c, master
branch, line 8363:

 /*
  * Delete old log files (those no longer needed even for previous
  * checkpoint or the standbys in XLOG streaming).
  */
 if (_logSegNo)
 {
     KeepLogSeg(recptr, &_logSegNo);
     _logSegNo--;
     RemoveOldXlogFiles(_logSegNo, recptr);
 }

 KeepLogSeg(...) function takes care of wal_keep_segments. From
src/backend/access/transam/xlog.c, master branch, line 8792:

 /* compute limit for wal_keep_segments first */
 if (wal_keep_segments > 0)
 {
     /* avoid underflow, don't go below 1 */
     if (segno <= wal_keep_segments)
         segno = 1;
     else
         segno = segno - wal_keep_segments;
 }

 IOW, the segment number (segno) is decremented according to the setting
of wal_keep_segments. segno is then sent back to CreateCheckPoint(...) via
_logSegNo. The RemoveOldXlogFiles() gets this segment number so that it can
remove or recycle all files before this segment number. This function gets
the number of WAL files to recycle with the XLOGfileslop constant, which is
defined as:

 /*
  * XLOGfileslop is the maximum number of preallocated future XLOG segments.
  * When we are done with an old XLOG segment file, we will recycle it as a
  * future XLOG segment as long as there aren't already XLOGfileslop future
  * segments; else we'll delete it.  This could be made a separate GUC
  * variable, but at present I think it's sufficient to hardwire it as
  * 2*CheckPointSegments+1.  Under normal conditions, a checkpoint will free
  * no more than 2*CheckPointSegments log segments, and we want to recycle all
  * of them; the +1 allows boundary cases to happen without wasting a
  * delete/create-segment cycle.
  */
 #define XLOGfileslop    (2*CheckPointSegments + 1)

 (in src/backend/access/transam/xlog.c, master branch, line 100)

 IOW, PostgreSQL will keep wal_keep_segments WAL files before the current
WAL file, and then there may be 2*CheckPointSegments + 1 recycled ones.
Hence the formula:

 wal_keep_segments + 2 * checkpoint_segments + 1

 And this is what we usually find in our customers' servers. We may find
more WAL files, depending on the write activity of the cluster, but in
average, we get this number of WAL files.

 AFAICT, the documentation is wrong about the usual number of WAL files in
the pg_xlog directory. But I may be wrong, in which case, the documentation
isn't clear enough for me, and should be fixed so that others can't
misinterpret it like I may have done.

 Any comments? did I miss something, or should we fix the documentation?

 Thanks.


Ping?


[HACKERS] Maximum number of WAL files in the pg_xlog directory

2014-08-08 Thread Guillaume Lelarge
Hi,

As part of our monitoring work for our customers, we stumbled upon an issue
with customer servers that have a wal_keep_segments setting higher
than 0.

We have a monitoring script that checks the number of WAL files in the
pg_xlog directory, according to the setting of three parameters
(checkpoint_completion_target, checkpoint_segments, and wal_keep_segments).
We usually add a percentage to the usual formula:

greatest(
  (2 + checkpoint_completion_target) * checkpoint_segments + 1,
  checkpoint_segments + wal_keep_segments + 1
)

And we have lots of alerts from the script for customers who set their
wal_keep_segments setting higher than 0.

So we started to question this sentence of the documentation:

There will always be at least one WAL segment file, and will normally not
be more than (2 + checkpoint_completion_target) * checkpoint_segments + 1
or checkpoint_segments + wal_keep_segments + 1 files.

(http://www.postgresql.org/docs/9.3/static/wal-configuration.html)

While doing some tests, it appears it would be something more like:

wal_keep_segments + (2 + checkpoint_completion_target) *
checkpoint_segments + 1

But after reading the source code (src/backend/access/transam/xlog.c), the
right formula seems to be:

wal_keep_segments + 2 * checkpoint_segments + 1

Here is how we arrived at this formula...

CreateCheckPoint(...) is responsible, among other things, for deleting and
recycling old WAL files. From src/backend/access/transam/xlog.c, master
branch, line 8363:

/*
 * Delete old log files (those no longer needed even for previous
 * checkpoint or the standbys in XLOG streaming).
 */
if (_logSegNo)
{
    KeepLogSeg(recptr, &_logSegNo);
    _logSegNo--;
    RemoveOldXlogFiles(_logSegNo, recptr);
}

The KeepLogSeg(...) function takes care of wal_keep_segments. From
src/backend/access/transam/xlog.c, master branch, line 8792:

/* compute limit for wal_keep_segments first */
if (wal_keep_segments > 0)
{
    /* avoid underflow, don't go below 1 */
    if (segno <= wal_keep_segments)
        segno = 1;
    else
        segno = segno - wal_keep_segments;
}
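
The clamp in the excerpt above can be exercised in isolation. This is only a sketch: the function name is hypothetical, uint64_t stands in for the server's segment number type, and everything else the real KeepLogSeg() does is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the wal_keep_segments clamp.
 * Returns the oldest segment number that must be kept, given the segment
 * number the checkpoint would otherwise allow removal up to.
 */
uint64_t
oldest_segno_to_keep(uint64_t segno, int wal_keep_segments)
{
    assert(wal_keep_segments >= 0);

    if (wal_keep_segments > 0)
    {
        /* avoid underflow, don't go below 1 */
        if (segno <= (uint64_t) wal_keep_segments)
            segno = 1;
        else
            segno = segno - (uint64_t) wal_keep_segments;
    }
    return segno;
}
```

For example, segno = 100 with wal_keep_segments = 30 moves the cut-off back to 70, while segno = 20 with the same setting is clamped to 1.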

IOW, the segment number (segno) is decremented according to the setting of
wal_keep_segments. segno is then passed back to CreateCheckPoint(...) via
_logSegNo. RemoveOldXlogFiles() gets this segment number so that it can
remove or recycle all files before it. This function takes the number of
WAL files to recycle from the XLOGfileslop constant, which is
defined as:

/*
 * XLOGfileslop is the maximum number of preallocated future XLOG segments.
 * When we are done with an old XLOG segment file, we will recycle it as a
 * future XLOG segment as long as there aren't already XLOGfileslop future
 * segments; else we'll delete it.  This could be made a separate GUC
 * variable, but at present I think it's sufficient to hardwire it as
 * 2*CheckPointSegments+1.  Under normal conditions, a checkpoint will free
 * no more than 2*CheckPointSegments log segments, and we want to recycle all
 * of them; the +1 allows boundary cases to happen without wasting a
 * delete/create-segment cycle.
 */
#define XLOGfileslop    (2*CheckPointSegments + 1)

(in src/backend/access/transam/xlog.c, master branch, line 100)
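
The recycle-or-delete rule described in that comment can be sketched as follows. The enum and function names are hypothetical, and the sketch assumes the caller already knows how many future segments are preallocated; it is not how RemoveOldXlogFiles() is actually structured.

```c
#include <assert.h>

/* Hypothetical outcome for an old segment that is no longer needed. */
enum wal_disposal
{
    WAL_RECYCLE,
    WAL_DELETE
};

/* Mirrors the hardwired definition: 2*CheckPointSegments + 1 */
int
xlog_file_slop(int checkpoint_segments)
{
    assert(checkpoint_segments > 0);
    return 2 * checkpoint_segments + 1;
}

/*
 * Recycle the old segment as long as there aren't already XLOGfileslop
 * future segments; otherwise delete it.
 */
enum wal_disposal
dispose_old_segment(int future_segments, int checkpoint_segments)
{
    assert(future_segments >= 0);

    if (future_segments < xlog_file_slop(checkpoint_segments))
        return WAL_RECYCLE;
    return WAL_DELETE;
}
```

With checkpoint_segments = 3 the slop is 7, so an old segment is recycled while fewer than 7 future segments exist and deleted once 7 are already available.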

IOW, PostgreSQL will keep wal_keep_segments WAL files before the current
WAL file, and then there may be 2*CheckPointSegments + 1 recycled ones.
Hence the formula:

wal_keep_segments + 2 * checkpoint_segments + 1

And this is what we usually find on our customers' servers. We may find
more WAL files, depending on the write activity of the cluster, but on
average, we get this number of WAL files.
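
To illustrate why a script based on the documented limit raises alerts once wal_keep_segments is set, here is a small comparison sketch. The function names are hypothetical and the values in the note below are made up for illustration; they are not taken from any customer server.

```c
#include <assert.h>

/*
 * Hypothetical helper for the documented limit:
 * greatest((2 + checkpoint_completion_target) * checkpoint_segments + 1,
 *          checkpoint_segments + wal_keep_segments + 1)
 */
double
documented_wal_limit(double checkpoint_completion_target,
                     int checkpoint_segments, int wal_keep_segments)
{
    double a;
    double b;

    assert(checkpoint_segments > 0 && wal_keep_segments >= 0);

    a = (2.0 + checkpoint_completion_target) * checkpoint_segments + 1.0;
    b = (double) checkpoint_segments + wal_keep_segments + 1.0;
    return a > b ? a : b;
}

/* Hypothetical helper for: wal_keep_segments + 2 * checkpoint_segments + 1 */
double
derived_wal_count(int checkpoint_segments, int wal_keep_segments)
{
    assert(checkpoint_segments > 0 && wal_keep_segments >= 0);
    return wal_keep_segments + 2.0 * checkpoint_segments + 1.0;
}
```

With checkpoint_completion_target = 0.5, checkpoint_segments = 10 and wal_keep_segments = 100, the documented limit is 111 while the derived count is 121, so the check fires. With wal_keep_segments = 0 the derived count (21) stays below the documented limit (26).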

AFAICT, the documentation is wrong about the usual number of WAL files in
the pg_xlog directory. But I may be wrong, in which case, the documentation
isn't clear enough for me, and should be fixed so that others can't
misinterpret it like I may have done.

Any comments? Did I miss something, or should we fix the documentation?

Thanks.


-- 
Guillaume.
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com