Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-19 Thread Greg Smith

On 09/05/2011 07:52 PM, Tomas Vondra wrote:

If your logging criteria for the write phase was display a message any
time more than 30 seconds have passed since last seeing one, that would
give you only a few lines of output in a boring, normal
checkpoint--certainly less than the 9 in-progress samples you're
outputting now, at 10% intervals.  But in the pathological situations
where writes are super slow, your log data would become correspondingly
denser, which is exactly what you want in that situation.
 

I still am not sure what should be a reasonable value or how to determine
it. What happens when the checkpoint_timeout is increased, there's more
shared_buffers etc.? What about using (checkpoint_timeout/10) for the
time-based checkpoints and 30s for the other checkpoints?
   


That may work fine.  Maybe implement it like that, and see if the amount 
of logging detail is reasonable in a couple of test scenarios.



I think combining the two makes the most sense:  log when=30 seconds
have passed since the last message, and there's been=10% more progress
made.  (Maybe do the progress check before the time one, to cut down on
 

Is this is a good idea? The case when the timeout expires and not much
data was written is interesting, and this would not log it. But OTOH this
would nicely solve the issue with time-based checkpoints and a fixed
threshold.
   


One thing I am trying to avoid here is needing to check the system clock 
after every buffer write.  I also consider it useful to put an upper 
bound on how many of these messages will appear even in the verbose 
mode.  This deals with both those problems.


Yes, there is a potential problem with this idea.  Let's say checkpoint 
writes degrade to where they take an hour.  In that case, you won't see 
the first progress report until 6 minutes (10%) have gone by with this 
implementation.  I don't see a good way to resolve that without 
violating one of the other priorities I listed above though.  You'll 
have to poll the system clock constantly and will end up creating a lot 
of log entries if you don't do a check against the % progress first.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-19 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 One thing I am trying to avoid here is needing to check the system clock 
 after every buffer write.

On machines where gettimeofday is slow (and last I heard there were
still lots of them), any such thing would be a disaster
performance-wise.  I'm still afraid to add more gettimeofday's into the
query parse/plan/execute code path, even though it would greatly ease
the problem of figuring out whether re-planning is worthwhile.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-19 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 On machines where gettimeofday is slow (and last I heard there were
 still lots of them), any such thing would be a disaster
 performance-wise.  I'm still afraid to add more gettimeofday's into the
 query parse/plan/execute code path, even though it would greatly ease
 the problem of figuring out whether re-planning is worthwhile.

Excuse my ignorance here, but is SIGALARM less of a problem?  Then you
could ask the system for an alarm next second and count the alarms
rather than poll the clock.  We don't need high precision in both those
cases I guess.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-05 Thread Tomas Vondra
On 2 Září 2011, 22:04, Tom Lane wrote:
 Tomas Vondra t...@fuzzy.cz writes:
 On 2 Z?? 2011, 21:23, Tom Lane wrote:
 Well, to be blunt, putting stuff into the postmaster log is entirely
 the
 wrong way to satify a requirement like that.  If you want to expose
 progress information, it should be exposed via something dynamic like
 pg_stat_activity.  What could be useful to log is statistics that
 people
 might want to aggregate later, and I don't immediately see a reason why
 such stats couldn't be logged just once at end of each checkpoint
 cycle.

 The problem with pg_stat_activity is that it provides just 'current
 state', no history. If you don't sample that often enough, you may
 completely miss the checkpoint (and thus you'll have no info about it,
 unless you enable log_checkpoints and check the log). And it's imposible
 to use if you need info about something that happened in the past. And
 AFAIK it does not show processes running timed checkpoints for example.

 Your requirements seem sufficiently slippery that I don't think you've
 thought them through very well.  As far as I can see, the proposed patch
 will mostly result in bloating the postmaster log with repetitive
 information of next to no value.  I can see the reason for wanting to
 know what the system is doing right now, and I can see the reason for
 wanting aggregatable statistics so that you can tell over time whether
 your settings need to be adjusted.  I don't see the value in a lot of
 10% done log entries --- there is essentially no historical value in
 such, IMO, because they don't correspond to any user-level activity.
 (Which is what distinguishes this from, say, log_connections or
 log_statements.) The fact that you can't figure out a reasonable
 frequency for making the entries is a symptom of the design being wrong
 at its core.

No, I probably haven't thought through all the requirements - but that's
exactly the feedback I expect from a commit fest.

Originally I've built the patch to fix a single use-case, where the 'log
each 10%' approach made perfect sense. I'm the first one to admin this is
not a good approach for the other use-cases and that the frequency should
be figured out in a different way.

Anyway I don't think that a piece of information that does not correspond
to any user-level activity is useless. Checkpoints are often a very
intrusively and significantly influence the users - that's why I believe
any help to tune them is desirable. The ability to see what the system is
doing right now and aggregatable statistics are fine, but in many cases
they're useless (e.g. when investigating past events).

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-05 Thread Tomas Vondra
On 3 Září 2011, 8:19, Greg Smith wrote:
 If you're expanding log_checkpoints to an enum, for that to handle what
 I think everybody might ever want (for what checkpoints do now at
 least), I'd find that more useful if it happened like this instead:

 log_checkpoints = {off, on, write, sync, verbose}

 I don't think you should change the semantics of off/on, which will
 avoid breaking existing postgresql.conf files and resources that suggest
 tuning advice.  write can toggle on what you're adding; sync should
 control whether the DEBUG1 messages showing the individual file names in
 the sync phase appear; and verbose can include both.

Thanks, those are definitely good ideas extending the original patch and
making it much more useful I guess.

 As far as a heuristic for making this less chatty when there's nothing
 exciting happening goes, I think something based on how much time has
 passed would be the best one.  In your use case, I would guess you don't
 really care whether a message appears every n%.  If I understand you
 correctly now, you would mainly care about getting enough log detail to
 know 1) when things are running really slow, or b) often enough that the
 margin of error in your benchmark results from unaccounted checkpoint
 writes is acceptable.  In both of those cases, I'd think a time-based
 threshold would be appropriate, and that also deals with the time-based
 checkpoints, too.

Yes, the time-based threshold seems like the right solution.

 If your logging criteria for the write phase was display a message any
 time more than 30 seconds have passed since last seeing one, that would
 give you only a few lines of output in a boring, normal
 checkpoint--certainly less than the 9 in-progress samples you're
 outputting now, at 10% intervals.  But in the pathological situations
 where writes are super slow, your log data would become correspondingly
 denser, which is exactly what you want in that situation.

I still am not sure what should be a reasonable value or how to determine
it. What happens when the checkpoint_timeout is increased, there's more
shared_buffers etc.? What about using (checkpoint_timeout/10) for the
time-based checkpoints and 30s for the other checkpoints?

 I think combining the two makes the most sense:  log when =30 seconds
 have passed since the last message, and there's been =10% more progress
 made.  (Maybe do the progress check before the time one, to cut down on

Is this is a good idea? The case when the timeout expires and not much
data was written is interesting, and this would not log it. But OTOH this
would nicely solve the issue with time-based checkpoints and a fixed
threshold.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-05 Thread Robert Haas
On Mon, Sep 5, 2011 at 7:52 PM, Tomas Vondra t...@fuzzy.cz wrote:
 If your logging criteria for the write phase was display a message any
 time more than 30 seconds have passed since last seeing one, that would
 give you only a few lines of output in a boring, normal
 checkpoint--certainly less than the 9 in-progress samples you're
 outputting now, at 10% intervals.  But in the pathological situations
 where writes are super slow, your log data would become correspondingly
 denser, which is exactly what you want in that situation.

 I still am not sure what should be a reasonable value or how to determine
 it. What happens when the checkpoint_timeout is increased, there's more
 shared_buffers etc.? What about using (checkpoint_timeout/10) for the
 time-based checkpoints and 30s for the other checkpoints?

I think the idea here is that we only need to log a message often
enough that the admin who is sitting there watching this won't get too
impatient waiting for the next one.  As that's not a function of
checkpoint_timeout, I don't see much value in conditioning this on
that.  +1 for the suggestion of 30s intervals - that seems infrequent
enough not to result in too much log spam, but sufficiently frequent
that anyone who is concerned about checkpoint progress won't have to
wait terribly long to find out how things are going.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-03 Thread Greg Smith

On 09/02/2011 11:10 AM, Tomas Vondra wrote:

My 'ideal' solution would be either to add another GUC (to turn this
on/off) or allow log_checkpoints to have three values

log_checkpoints = {off, normal, detailed}

where 'normal' provides the current output and 'detail' produces this much
verbose output.
   


If this is going to be acceptable, that's likely the only path it could 
happen by and still meet what you're looking for.  I will just again 
stress that the part you're working on instrumenting better right now is 
not actually where larger systems really run into the most problems 
here, based on what I've seen.  I added a series of log messages to 9.1 
at DEBUG1, aimed at tracking the sync phase.  That's where I see many 
more checkpoint issues than in the write one.  On Linux in particular, 
it's almost impossible for the write phase to be more of a problem than 
the sync one.


So the logging you're adding here I don't ever expect to turn on.  But I 
wouldn't argue against an option to handle the logging use-case you're 
concerned about.  Letting people observe for themselves and decide which 
of the phases is more interesting to their workload seems appropriate.  
Then users have options for what to log, no matter which type of problem 
they run into.


If you're expanding log_checkpoints to an enum, for that to handle what 
I think everybody might ever want (for what checkpoints do now at 
least), I'd find that more useful if it happened like this instead:


log_checkpoints = {off, on, write, sync, verbose}

I don't think you should change the semantics of off/on, which will 
avoid breaking existing postgresql.conf files and resources that suggest 
tuning advice.  write can toggle on what you're adding; sync should 
control whether the DEBUG1 messages showing the individual file names in 
the sync phase appear; and verbose can include both.


As far as a heuristic for making this less chatty when there's nothing 
exciting happening goes, I think something based on how much time has 
passed would be the best one.  In your use case, I would guess you don't 
really care whether a message appears every n%.  If I understand you 
correctly now, you would mainly care about getting enough log detail to 
know 1) when things are running really slow, or b) often enough that the 
margin of error in your benchmark results from unaccounted checkpoint 
writes is acceptable.  In both of those cases, I'd think a time-based 
threshold would be appropriate, and that also deals with the time-based 
checkpoints, too.


If your logging criteria for the write phase was display a message any 
time more than 30 seconds have passed since last seeing one, that would 
give you only a few lines of output in a boring, normal 
checkpoint--certainly less than the 9 in-progress samples you're 
outputting now, at 10% intervals.  But in the pathological situations 
where writes are super slow, your log data would become correspondingly 
denser, which is exactly what you want in that situation.


I think combining the two makes the most sense:  log when =30 seconds 
have passed since the last message, and there's been =10% more progress 
made.  (Maybe do the progress check before the time one, to cut down on 
gettimeofday() calls)  That would give you 4 in-progress reports during 
a standard 2.5 minute write phase, and in cases where the checkpoints 
are taking a long time you'd get as many as 9.  That's pretty close to 
auto-tuning the amount of log output, so the amount of it is roughly 
proportional to how likely the information it's logging will be 
interesting.


We certainly don't want to add yet another GUC just to control the 
frequency.  I don't think it will be too hard to put two hard-coded 
thresholds in and do good enough for just about everyone though.  I 
would probably prefer setting those thresholds to 60 seconds/20% 
instead.  That might not be detailed enough for you though.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Magnus Hagander
On Thu, Sep 1, 2011 at 21:59, Tomas Vondra t...@fuzzy.cz wrote:
 I've prepared a significantly simplified version of the patch. The two
 main changes are

 (a) it does not update the pg_stat_bgwriter anymore, it just prints an
 info to the server log
 (b) a new GUC is not required, it's driven by the log_checkpoints

The comment still refers to the checkpoint_update_limit.


 This version will log at least 10 'checkpoint status' lines (at 10%, 20%,
 30%, ...) and whenever 5 seconds since the last log elapses. The time is
 not checked for each buffer but for 128 buffers.

 So if the checkpoint is very slow, you'll get a message every 5 seconds,
 if it's fast you'll get 10 messages.

I would personally find this very annoying. If I read it correctly,
anybody with a database with no problem at all but that has
log_checkpoints on suddenly got at least 10 times as many messages? I
generally try to advise my clients to *not* log excessively because
then they will end up not bothering to read the logs...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tomas Vondra
On 2 Září 2011, 9:47, Magnus Hagander wrote:
 On Thu, Sep 1, 2011 at 21:59, Tomas Vondra t...@fuzzy.cz wrote:
 I've prepared a significantly simplified version of the patch. The two
 main changes are

 (a) it does not update the pg_stat_bgwriter anymore, it just prints an
 info to the server log
 (b) a new GUC is not required, it's driven by the log_checkpoints

 The comment still refers to the checkpoint_update_limit.

OK, I'll fix that.

 This version will log at least 10 'checkpoint status' lines (at 10%,
 20%,
 30%, ...) and whenever 5 seconds since the last log elapses. The time is
 not checked for each buffer but for 128 buffers.

 So if the checkpoint is very slow, you'll get a message every 5 seconds,
 if it's fast you'll get 10 messages.

 I would personally find this very annoying. If I read it correctly,
 anybody with a database with no problem at all but that has
 log_checkpoints on suddenly got at least 10 times as many messages? I
 generally try to advise my clients to *not* log excessively because
 then they will end up not bothering to read the logs...

What about logging it with a lower level, e.g. NOTICE instead of the
current LOG? If that's not a solution then a new GUC is needed I guess.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Magnus Hagander
On Fri, Sep 2, 2011 at 11:01, Tomas Vondra t...@fuzzy.cz wrote:
 On 2 Září 2011, 9:47, Magnus Hagander wrote:
 On Thu, Sep 1, 2011 at 21:59, Tomas Vondra t...@fuzzy.cz wrote:
 I've prepared a significantly simplified version of the patch. The two
 main changes are

 (a) it does not update the pg_stat_bgwriter anymore, it just prints an
 info to the server log
 (b) a new GUC is not required, it's driven by the log_checkpoints

 The comment still refers to the checkpoint_update_limit.

 OK, I'll fix that.

 This version will log at least 10 'checkpoint status' lines (at 10%,
 20%,
 30%, ...) and whenever 5 seconds since the last log elapses. The time is
 not checked for each buffer but for 128 buffers.

 So if the checkpoint is very slow, you'll get a message every 5 seconds,
 if it's fast you'll get 10 messages.

 I would personally find this very annoying. If I read it correctly,
 anybody with a database with no problem at all but that has
 log_checkpoints on suddenly got at least 10 times as many messages? I
 generally try to advise my clients to *not* log excessively because
 then they will end up not bothering to read the logs...

 What about logging it with a lower level, e.g. NOTICE instead of the
 current LOG? If that's not a solution then a new GUC is needed I guess.

I guess if it's at a DEBUG level it won't annoy anybody who doesn't
need it. Not sure if NOTICE is low enough..

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tomas Vondra
On 2 Září 2011, 12:45, Magnus Hagander wrote:
 On Fri, Sep 2, 2011 at 11:01, Tomas Vondra t...@fuzzy.cz wrote:
 What about logging it with a lower level, e.g. NOTICE instead of the
 current LOG? If that's not a solution then a new GUC is needed I guess.

 I guess if it's at a DEBUG level it won't annoy anybody who doesn't
 need it. Not sure if NOTICE is low enough..

I've changed the level to NOTICE. I guess I could put that to info, but
the debug levels seem too low to me. Is there a rule of a thumb about
where to put messages?

Tomasdiff --git a/src/backend/storage/buffer/bufmgr.c 
b/src/backend/storage/buffer/bufmgr.c
new file mode 100644
index 4c7cfb0..a9bdd89
*** a/src/backend/storage/buffer/bufmgr.c
--- b/src/backend/storage/buffer/bufmgr.c
*** BufferSync(int flags)
*** 1175,1180 
--- 1175,1193 
int num_to_write;
int num_written;
int mask = BM_DIRTY;
+   
+   int num_since_update;
+   
+   longcurr_secs,
+   total_secs;
+   int curr_usecs,
+   total_usecs;
+   float   curr_time,
+   total_time;
+   
+   TimestampTz startTimestamp, lastTimestamp;
+   
+   int log_interval, check_interval;
  
/* Make sure we can handle the pin inside SyncOneBuffer */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
*** BufferSync(int flags)
*** 1238,1243 
--- 1251,1265 
buf_id = StrategySyncStart(NULL, NULL);
num_to_scan = NBuffers;
num_written = 0;
+   num_since_update = 0;
+   
+   startTimestamp = GetCurrentTimestamp();
+   lastTimestamp = startTimestamp;
+   
+   /* check the progress each  128 buffers or 10% */
+   log_interval = (int)round(num_to_write/10);
+   check_interval = ((log_interval  128)  (log_interval  0)) ? 
log_interval : 128;
+   
while (num_to_scan--  0)
{
volatile BufferDesc *bufHdr = BufferDescriptors[buf_id];
*** BufferSync(int flags)
*** 1261,1266 
--- 1283,1334 
TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id);
BgWriterStats.m_buf_written_checkpoints++;
num_written++;
+   num_since_update++;
+   
+   /*
+* Every time we write enough buffers we check 
the time since 
+* last log and whenever we reach 5 seconds or 
10% of the buffers
+* to write, we put a checkpoint status message.
+* 
+* The log message contains info about total 
number of buffers to
+* write, how many buffers are already written, 
average and current
+* write speed and an estimate remaining time.
+*/
+   if ((log_checkpoints)  (num_written % 
check_interval == 0)) {
+ 
+   TimestampDifference(lastTimestamp,
+   GetCurrentTimestamp(),
+   curr_secs, curr_usecs);
+   
+   /* if at least 5 seconds elapsed since 
the last log, log */
+   if ((curr_secs = 5) || 
(num_since_update = log_interval)) {
+   
+   
TimestampDifference(startTimestamp,
+   GetCurrentTimestamp(),
+   total_secs, 
total_usecs);
+   
+   curr_time = curr_secs + 
(float)curr_usecs / 100;
+   total_time = total_secs + 
(float)total_usecs / 100;
+   
+   elog(NOTICE, checkpoint 
status: wrote %d buffers of %d (%.1f%%) in %.1f s; 
+   average %.1f MB/s (%d 
buffers, %ld.%03d s), 
+   current %.1f MB/s (%d 
buffers, %ld.%03d s), 
+   remaining %.1f s,
+   num_written, 
num_to_write, ((float) num_written * 100 / num_to_write),
+   

Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Peter Eisentraut
On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
 What about logging it with a lower level, e.g. NOTICE instead of the
 current LOG? If that's not a solution then a new GUC is needed I
 guess.

Changing the log level is not the appropriate solution.  Make it a
configuration parameter.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
 What about logging it with a lower level, e.g. NOTICE instead of the
 current LOG? If that's not a solution then a new GUC is needed I
 guess.

 Changing the log level is not the appropriate solution.  Make it a
 configuration parameter.

Frankly, logging as verbose as this is proposed to be is something
that I can't imagine anybody wanting at all, especially not in
production environments.  DEBUG3 or somewhere around there seems fine.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tomas Vondra
On 2 Září 2011, 15:44, Peter Eisentraut wrote:
 On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
 What about logging it with a lower level, e.g. NOTICE instead of the
 current LOG? If that's not a solution then a new GUC is needed I
 guess.

 Changing the log level is not the appropriate solution.  Make it a
 configuration parameter.

Why is it inappropriate solution? There's a log_checkpoints GUC that
drives it and you can either get basic info (summary of the checkpoint) or
detailed log (with a lower log level).

In the first patch I've proposed a new GUC (used to set how often the info
should be logged or disable it), but Josh Berkus pointed out that I should
get rid of it if I can. Which is what I've done in the following patches.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Magnus Hagander
On Sep 2, 2011 5:02 PM, Tomas Vondra t...@fuzzy.cz wrote:

 On 2 Září 2011, 15:44, Peter Eisentraut wrote:
  On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
  What about logging it with a lower level, e.g. NOTICE instead of the
  current LOG? If that's not a solution then a new GUC is needed I
  guess.
 
  Changing the log level is not the appropriate solution.  Make it a
  configuration parameter.

 Why is it inappropriate solution? There's a log_checkpoints GUC that
 drives it and you can either get basic info (summary of the checkpoint) or
 detailed log (with a lower log level).

 In the first patch I've proposed a new GUC (used to set how often the info
 should be logged or disable it), but Josh Berkus pointed out that I should
 get rid of it if I can. Which is what I've done in the following patches.

Well, josh doesn't speak for everybody ;-)

Maybe one way could be to change log_checkpoints into an enum of off, on,
debug (values open for bikeshedding of course)

/Magnus


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tomas Vondra
On 2 Září 2011, 16:13, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
 On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
 What about logging it with a lower level, e.g. NOTICE instead of the
 current LOG? If that's not a solution then a new GUC is needed I
 guess.

 Changing the log level is not the appropriate solution.  Make it a
 configuration parameter.

 Frankly, logging as verbose as this is proposed to be is something
 that I can't imagine anybody wanting at all, especially not in
 production environments.  DEBUG3 or somewhere around there seems fine.

Yes, it can be a bit verbose, but I don't think it's that verbose. I'm one
of those this information about checkpoint progress may be a valuable info
- even on production systems - that's why I don't want to put that into
the debug levels.

It should write about 10 lines for each checkpoint (altough I've just
realized the current patch won't do that for the timed checkpoints - it'll
probably log significantly more).

My 'ideal' solution would be either to add another GUC (to turn this
on/off) or allow log_checkpoints to have three values

log_checkpoints = {off, normal, detailed}

where 'normal' provides the current output and 'detail' produces this much
verbose output.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tomas Vondra
On 2 Září 2011, 17:08, Magnus Hagander wrote:
 On Sep 2, 2011 5:02 PM, Tomas Vondra t...@fuzzy.cz wrote:

 On 2 Září 2011, 15:44, Peter Eisentraut wrote:
  On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
  What about logging it with a lower level, e.g. NOTICE instead of the
  current LOG? If that's not a solution then a new GUC is needed I
  guess.
 
  Changing the log level is not the appropriate solution.  Make it a
  configuration parameter.

 Why is it inappropriate solution? There's a log_checkpoints GUC that
 drives it and you can either get basic info (summary of the checkpoint)
 or
 detailed log (with a lower log level).

 In the first patch I've proposed a new GUC (used to set how often the
 info
 should be logged or disable it), but Josh Berkus pointed out that I
 should
 get rid of it if I can. Which is what I've done in the following
 patches.

 Well, josh doesn't speak for everybody ;-)

Sure, but I think the effort not to have a zillion of GUC makes sense.

 Maybe one way could be to change log_checkpoints into an enum of off, on,
 debug (values open for bikeshedding of course)

Yes, that's actually one of the solutions I'd prefer. Not sure why I
rejected it ...

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Kevin Grittner
.Tomas Vondra t...@fuzzy.cz wrote:
 On 2 Zá*í 2011, 15:44, Peter Eisentraut wrote:
 
 Changing the log level is not the appropriate solution.  Make it
 a configuration parameter.
 
 Why is it inappropriate solution? There's a log_checkpoints GUC
 that drives it and you can either get basic info (summary of the
 checkpoint) or detailed log (with a lower log level).
 
Maybe?:
 
log_checkpoints = 'verbose'
 
It seems to me that one of the most likely uses of this is for
investigating performance issues in production.  Like with
log_statement, you might want to turn it on briefly and examine
results.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Magnus Hagander
On Sep 2, 2011 5:18 PM, Jaime Casanova ja...@2ndquadrant.com wrote:

 On Fri, Sep 2, 2011 at 10:10 AM, Tomas Vondra t...@fuzzy.cz wrote:
 
  My 'ideal' solution would be either to add another GUC (to turn this
  on/off) or allow log_checkpoints to have three values
 
  log_checkpoints = {off, normal, detailed}
 

 +1 on make it an enum but i prefer values {on, off, verbose}

 where on will log just the same as now

Yeah, keeping on and off working the same way as they do today is a feature
on itself.

/Magnus


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Jaime Casanova
On Fri, Sep 2, 2011 at 10:10 AM, Tomas Vondra t...@fuzzy.cz wrote:

 My 'ideal' solution would be either to add another GUC (to turn this
 on/off) or allow log_checkpoints to have three values

 log_checkpoints = {off, normal, detailed}


+1 on make it an enum but i prefer values {on, off, verbose}

where on will log just the same as now

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Robert Haas
On Thu, Sep 1, 2011 at 3:59 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I've prepared a significantly simplified version of the patch. The two
 main changes are

 (a) it does not update the pg_stat_bgwriter anymore, it just prints an
 info to the server log

 (b) a new GUC is not required, it's driven by the log_checkpoints

 This version will log at least 10 'checkpoint status' lines (at 10%, 20%,
 30%, ...) and whenever 5 seconds since the last log elapses. The time is
 not checked for each buffer but for 128 buffers.

 So if the checkpoint is very slow, you'll get a message every 5 seconds,
 if it's fast you'll get 10 messages.

This seems like a strange heuristic.  I understand the value of
emitting a progress report every N seconds, but why would you want a
report at 10%, 20%, 30% even if it hasn't been 5 seconds yet?  I don't
need ten progress messages on a checkpoint that only takes three
seconds to complete.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Peter Eisentraut
On fre, 2011-09-02 at 17:02 +0200, Tomas Vondra wrote:
 On 2 Září 2011, 15:44, Peter Eisentraut wrote:
  On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
  What about logging it with a lower level, e.g. NOTICE instead of the
  current LOG? If that's not a solution then a new GUC is needed I
  guess.
 
  Changing the log level is not the appropriate solution.  Make it a
  configuration parameter.
 
 Why is it inappropriate solution? There's a log_checkpoints GUC that
 drives it and you can either get basic info (summary of the checkpoint) or
 detailed log (with a lower log level).

If a user is forced to change the log level to get at one particular
piece of information, they will then also turn on countless other log
events on that level, which is annoying.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On fre, 2011-09-02 at 17:02 +0200, Tomas Vondra wrote:
 Why is it inappropriate solution? There's a log_checkpoints GUC that
 drives it and you can either get basic info (summary of the checkpoint) or
 detailed log (with a lower log level).

 If a user is forced to change the log level to get at one particular
 piece of information, they will then also turn on countless other log
 events on that level, which is annoying.

Yeah, if we're going to have this at all, some form of GUC control over
it seems necessary.  I'm still disturbed by the verbosity of the
proposed output though.  Couldn't we collapse the information into a
single log entry per checkpoint cycle?  Perhaps that would be sufficient
to just let the log_checkpoints setting be used as-is.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tomas Vondra
On 2 Září 2011, 20:46, Peter Eisentraut wrote:
 On fre, 2011-09-02 at 17:13 +0200, Tomas Vondra wrote:
 Sure, but I think the effort not to have a zillion of GUC makes sense.

 Well, I'll be the first to agree that reducing complexity in
 configuration and tuning settings is worth aiming for.

 But for logging, I'd rather have more settings, theoretically up to one
 for each possible message.  That doesn't increase complexity, as long as
 it has linear behavior.  It's debatable whether that means a new
 log_something parameter for each situation, or just a single parameter
 containing some kind of list, or something else, but that's a different
 problem.

OK, good point. The effort to make this work without a GUC obviously is a
dead end, and the idea to turn log_checkpoints into an enum seems
reasonable.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tomas Vondra
On 2 Září 2011, 20:48, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
 On fre, 2011-09-02 at 17:02 +0200, Tomas Vondra wrote:
 Why is it inappropriate solution? There's a log_checkpoints GUC that
 drives it and you can either get basic info (summary of the checkpoint)
 or
 detailed log (with a lower log level).

 If a user is forced to change the log level to get at one particular
 piece of information, they will then also turn on countless other log
 events on that level, which is annoying.

 Yeah, if we're going to have this at all, some form of GUC control over
 it seems necessary.  I'm still disturbed by the verbosity of the
 proposed output though.  Couldn't we collapse the information into a
 single log entry per checkpoint cycle?  Perhaps that would be sufficient
 to just let the log_checkpoints setting be used as-is.

Yes, the GUC seems inevitable. I'm still working on the verbosity for
different checpoint typees, but I think 5 seconds for xlog and 10% for
timed checkpoint is a reasonable setting.

I'm not sure what you mean by collapsing the info into a single log entry?
That would mean I'd have to wait till the checkpoint completes, and one of
the reasons for this patch was to get info about progress while the
checkpoint is running.

For example I'd like to have this information in cases when the checkpoint
never finishes - for example when performing automated benchmarks. I
sometimes just kill the database (I need to rebuild it from scratch for
the next run) so I don't get any checkpoint message at all. Waiting for
the checkpoint would significantly increase the time for each run and thus
for the whole benchmark. With one run that probably is not a problem, with
360 runs each minute makes a big difference.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 On 2 Září 2011, 20:48, Tom Lane wrote:
 Yeah, if we're going to have this at all, some form of GUC control over
 it seems necessary.  I'm still disturbed by the verbosity of the
 proposed output though.  Couldn't we collapse the information into a
 single log entry per checkpoint cycle?  Perhaps that would be sufficient
 to just let the log_checkpoints setting be used as-is.

 I'm not sure what you mean by collapsing the info into a single log entry?
 That would mean I'd have to wait till the checkpoint completes, and one of
 the reasons for this patch was to get info about progress while the
 checkpoint is running.

Well, to be blunt, putting stuff into the postmaster log is entirely the
wrong way to satify a requirement like that.  If you want to expose
progress information, it should be exposed via something dynamic like
pg_stat_activity.  What could be useful to log is statistics that people
might want to aggregate later, and I don't immediately see a reason why
such stats couldn't be logged just once at end of each checkpoint cycle.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tomas Vondra
On 2 Září 2011, 21:23, Tom Lane wrote:
 Tomas Vondra t...@fuzzy.cz writes:
 On 2 Z?? 2011, 20:48, Tom Lane wrote:
 Yeah, if we're going to have this at all, some form of GUC control over
 it seems necessary.  I'm still disturbed by the verbosity of the
 proposed output though.  Couldn't we collapse the information into a
 single log entry per checkpoint cycle?  Perhaps that would be
 sufficient
 to just let the log_checkpoints setting be used as-is.

 I'm not sure what you mean by collapsing the info into a single log
 entry?
 That would mean I'd have to wait till the checkpoint completes, and one
 of
 the reasons for this patch was to get info about progress while the
 checkpoint is running.

 Well, to be blunt, putting stuff into the postmaster log is entirely the
 wrong way to satify a requirement like that.  If you want to expose
 progress information, it should be exposed via something dynamic like
 pg_stat_activity.  What could be useful to log is statistics that people
 might want to aggregate later, and I don't immediately see a reason why
 such stats couldn't be logged just once at end of each checkpoint cycle.

The problem with pg_stat_activity is that it provides just 'current
state', no history. If you don't sample that often enough, you may
completely miss the checkpoint (and thus you'll have no info about it,
unless you enable log_checkpoints and check the log). And it's imposible
to use if you need info about something that happened in the past. And
AFAIK it does not show processes running timed checkpoints for example.

So IMHO it's unusable for what I'm trying to achieve.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-02 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 On 2 Září 2011, 21:23, Tom Lane wrote:
 Well, to be blunt, putting stuff into the postmaster log is entirely the
 wrong way to satify a requirement like that.  If you want to expose
 progress information, it should be exposed via something dynamic like
 pg_stat_activity.  What could be useful to log is statistics that people
 might want to aggregate later, and I don't immediately see a reason why
 such stats couldn't be logged just once at end of each checkpoint cycle.

 The problem with pg_stat_activity is that it provides just 'current
 state', no history. If you don't sample that often enough, you may
 completely miss the checkpoint (and thus you'll have no info about it,
 unless you enable log_checkpoints and check the log). And it's imposible
 to use if you need info about something that happened in the past. And
 AFAIK it does not show processes running timed checkpoints for example.

Your requirements seem sufficiently slippery that I don't think you've
thought them through very well.  As far as I can see, the proposed patch
will mostly result in bloating the postmaster log with repetitive
information of next to no value.  I can see the reason for wanting to
know what the system is doing right now, and I can see the reason for
wanting aggregatable statistics so that you can tell over time whether
your settings need to be adjusted.  I don't see the value in a lot of
10% done log entries --- there is essentially no historical value in
such, IMO, because they don't correspond to any user-level activity.
(Which is what distinguishes this from, say, log_connections or
log_statements.)  The fact that you can't figure out a reasonable
frequency for making the entries is a symptom of the design being wrong
at its core.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-01 Thread Tomas Vondra
I've prepared a significantly simplified version of the patch. The two
main changes are

(a) it does not update the pg_stat_bgwriter anymore, it just prints an
info to the server log

(b) a new GUC is not required, it's driven by the log_checkpoints

This version will log at least 10 'checkpoint status' lines (at 10%, 20%,
30%, ...) and whenever 5 seconds since the last log elapses. The time is
not checked for each buffer but for 128 buffers.

So if the checkpoint is very slow, you'll get a message every 5 seconds,
if it's fast you'll get 10 messages.

Tomasdiff --git a/src/backend/storage/buffer/bufmgr.c 
b/src/backend/storage/buffer/bufmgr.c
new file mode 100644
index 4c7cfb0..e60d304
*** a/src/backend/storage/buffer/bufmgr.c
--- b/src/backend/storage/buffer/bufmgr.c
*** BufferSync(int flags)
*** 1175,1180 
--- 1175,1193 
int num_to_write;
int num_written;
int mask = BM_DIRTY;
+   
+   int num_since_update;
+   
+   longcurr_secs,
+   total_secs;
+   int curr_usecs,
+   total_usecs;
+   float   curr_time,
+   total_time;
+   
+   TimestampTz startTimestamp, lastTimestamp;
+   
+   int log_interval, check_interval;
  
/* Make sure we can handle the pin inside SyncOneBuffer */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
*** BufferSync(int flags)
*** 1238,1243 
--- 1251,1265 
buf_id = StrategySyncStart(NULL, NULL);
num_to_scan = NBuffers;
num_written = 0;
+   num_since_update = 0;
+   
+   startTimestamp = GetCurrentTimestamp();
+   lastTimestamp = startTimestamp;
+   
+   /* check the progress each  128 buffers or 10% */
+   log_interval = (int)round(num_to_write/10);
+   check_interval = ((log_interval  128)  (log_interval  0)) ? 
log_interval : 128;
+   
while (num_to_scan--  0)
{
volatile BufferDesc *bufHdr = BufferDescriptors[buf_id];
*** BufferSync(int flags)
*** 1261,1266 
--- 1283,1334 
TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id);
BgWriterStats.m_buf_written_checkpoints++;
num_written++;
+   num_since_update++;
+   
+   /*
+* Every time we write enough buffers 
(checkpoint_update_limit),
+* we log a checkpoint status message and 
update the bgwriter
+* stats (so that the pg_stat_bgwriter table 
may be updated).
+* 
+* The log message contains info about total 
number of buffers to
+* write, how many buffers are already written, 
average and current
+* write speed and an estimate remaining time.
+*/
+   if ((log_checkpoints)  (num_written % 
check_interval == 0)) {
+ 
+   TimestampDifference(lastTimestamp,
+   GetCurrentTimestamp(),
+   curr_secs, curr_usecs);
+   
+   /* if at least 5 seconds elapsed since 
the last log, log */
+   if ((curr_secs = 5) || 
(num_since_update = log_interval)) {
+   
+   
TimestampDifference(startTimestamp,
+   GetCurrentTimestamp(),
+   total_secs, 
total_usecs);
+   
+   curr_time = curr_secs + 
(float)curr_usecs / 100;
+   total_time = total_secs + 
(float)total_usecs / 100;
+   
+   elog(LOG, checkpoint status: 
wrote %d buffers of %d (%.1f%%) in %.1f s; 
+   average %.1f MB/s (%d 
buffers, %ld.%03d s), 
+   current %.1f MB/s (%d 
buffers, %ld.%03d s), 
+   remaining %.1f s,
+   num_written, 
num_to_write, ((float) num_written * 100 / num_to_write),
+   

Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-29 Thread Tomas Vondra
On 29 Srpen 2011, 7:47, Noah Misch wrote:
 On Sat, Aug 27, 2011 at 03:57:16PM +0200, Tomas Vondra wrote:
 On 27 Srpen 2011, 6:01, Noah Misch wrote:
  Could you remove this hazard by adding a step 2a. psql -c
 CHECKPOINT?

 I already do that, but it really does not solve the issue. It just
 aligns
 the first expected 'timed' checkpoint, it does not solve the problem
 with
 in-progress checkpoints unless the runs behave exactly the same (and
 that's the boring case).

 To clarify, run that command _after_ the 10-minute pgbench run.  It
 blocks until completion of both the in-progress checkpoint, if any,
 and the requested checkpoint.

I don't think this is going to help when there is a checkpoint in
progress, as the new CHECKPOINT will wait till the already running one
completes. It might force it to finish faster, but I still can't say how
much data was written when the pgbench was running.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-28 Thread Noah Misch
On Sat, Aug 27, 2011 at 03:57:16PM +0200, Tomas Vondra wrote:
 On 27 Srpen 2011, 6:01, Noah Misch wrote:
  On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:
  1. collect pg_stat_bgwriter stats
  2. run pgbench for 10 minutes
  3. collect pg_stat_bgwriter stats (to compute difference with (1))
  4. kill the postmaster
 
  The problem is that when checkpoint stats are collected, there might be
  a
  checkpoint in progress and in that case the stats are incomplete. In
  some
  cases (especially with very small db blocks) this has significant impact
  because the checkpoints are less frequent.
 
  Could you remove this hazard by adding a step 2a. psql -c CHECKPOINT?
 
 I already do that, but it really does not solve the issue. It just aligns
 the first expected 'timed' checkpoint, it does not solve the problem with
 in-progress checkpoints unless the runs behave exactly the same (and
 that's the boring case).

To clarify, run that command _after_ the 10-minute pgbench run.  It
blocks until completion of both the in-progress checkpoint, if any,
and the requested checkpoint.

-- 
Noah Mischhttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-27 Thread Greg Smith

On 08/27/2011 12:01 AM, Noah Misch wrote:

On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:
   

1. collect pg_stat_bgwriter stats
2. run pgbench for 10 minutes
3. collect pg_stat_bgwriter stats (to compute difference with (1))
4. kill the postmaster

The problem is that when checkpoint stats are collected, there might be a
checkpoint in progress and in that case the stats are incomplete. In some
cases (especially with very small db blocks) this has significant impact
because the checkpoints are less frequent.
 

Could you remove this hazard by adding a step 2a. psql -c CHECKPOINT?
   


That's what I do in pgbench-tools, and it helps a lot.  It makes it 
easier to identify when the checkpoint kicks in if you know it's 
approximately the same time after each test run begins, given similar 
testing parameters.  That said, it's hard to eliminate all of the edge 
conditions here.


For example, imagine that you're consuming WAL files such that you hit 
checkpoint_segments every 4 minutes.  In a 10 minute test run, a 
checkpoint will start at 4:00 and finish at around 6:00 (with 
checkpoint_completion_target=0.5).  The next will start at 8:00 and 
should finish at around 10:00--right at the end of when the test ends.  
Given the variation that sync timing and rounding issues in the write 
phase adds to things, you can expect that some test runs will include 
stats from 2 checkpoints, while others will end the test just before the 
second one finishes.  It does throw the numbers off a bit.


To avoid this when it pops up, I normally aim to push up to where there 
are =4 checkpoints per test run, just so whether I get n or n-1 of them 
doesn't impact results as much.  But that normally takes doubling the 
length of the test to 20 minutes.  As it will often take me days of test 
time to plow through exploring just a couple of parameters, I'm 
sympathetic to Tomas trying to improve accuracy here without having to 
run for quite so long.  There's few people who have this problem to 
worry about though, it's a common issue with benchmarking but not many 
other contexts.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-27 Thread Tomas Vondra
On 27 Srpen 2011, 6:01, Noah Misch wrote:
 On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:
 1. collect pg_stat_bgwriter stats
 2. run pgbench for 10 minutes
 3. collect pg_stat_bgwriter stats (to compute difference with (1))
 4. kill the postmaster

 The problem is that when checkpoint stats are collected, there might be
 a
 checkpoint in progress and in that case the stats are incomplete. In
 some
 cases (especially with very small db blocks) this has significant impact
 because the checkpoints are less frequent.

 Could you remove this hazard by adding a step 2a. psql -c CHECKPOINT?

I already do that, but it really does not solve the issue. It just aligns
the first expected 'timed' checkpoint, it does not solve the problem with
in-progress checkpoints unless the runs behave exactly the same (and
that's the boring case).

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Greg Smith

On 08/25/2011 04:57 PM, Tomas Vondra wrote:

(b) sends bgwriter stats (so that the buffers_checkpoint is updated)
   


The idea behind only updating the stats in one chunk, at the end, is 
that it makes one specific thing easier to do.  Let's say you're running 
a monitoring system that is grabbing snapshots of pg_stat_bgwriter 
periodically.  If you want to figure out how much work a checkpoint did, 
you only need two points of data to compute that right now.  Whenever 
you see either of the checkpoint count numbers increase, you just 
subtract off the previous sample; now you've got a delta for how many 
buffers that checkpoint wrote out.  You can derive the information about 
the buffer counts involved that appears in the logs quite easily this 
way.  The intent was to make that possible to do, so that people can 
figure this out without needing to parse the log data.


Spreading out the updates defeats that idea.  It also makes it possible 
to see the buffer writes more in real-time, as they happen.  You can 
make a case for both approaches having their use cases; the above is 
just summarizing the logic behind why it's done the way it is right 
now.  I don't think many people are actually doing things with this to 
the level where their tool will care.  The most popular consumer of 
pg_stat_bgwriter data I see is Munin graphing changes, and I don't think 
it will care either way.


Giving people the option of doing it the other way is a reasonable idea, 
but I'm not sure there's enough use case there to justify adding a GUC 
just for that.  My next goal here is to eliminate checkpoint_segments, 
not to add yet another tunable extremely few users would ever touch.


As for throwing more log data out, I'm not sure what new analysis you're 
thinking of that it allows.  I/O gets increasingly spiky as you zoom in 
on it; averaging over a shorter period can easily end up providing less 
insight about trends.  If anything, I spend more time summarizing the 
data that's already there, rather than wanting to break them down.  It's 
already providing way too much detail for most people.  Customers tell 
me they don't care to see checkpoint stats unless they're across a day 
or more of sampling, so even the current once every ~5 minutes is way 
more info than they want.  I have all this log parsing code and things 
that look at pg_stat_bgwriter to collect that data and produce higher 
level reports.  And lots of it would break if any of this patch is added 
and people turn it on.  I imagine other log/stat parsing programs might 
suffer issues too.  That's your other hurdle for change here:  the new 
analysis techniques have to be useful enough to justify that some 
downstream tool disruption is inevitable.


If you have an idea for how to use this extra data for something useful, 
let's talk about what that is and see if it's possible to build it in 
instead.  This problem is harder than it looks, mainly because the way 
the OS caches writes here makes trying to derive hard numbers from what 
the background writer is doing impossible.  When the database writes 
things out, and when they actually get written to disk, they are not the 
same event.  The actual write is often during the sync phase, and not 
being able to tracking that beast is where I see the most problems at.  
The write phase, the easier part to instrument in the database, that is 
pretty boring.  That's why the last extra logging I added here focused 
on adding visibility to the sync activity instead.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Magnus Hagander
On Thu, Aug 25, 2011 at 22:57, Tomas Vondra t...@fuzzy.cz wrote:
 Hello,

 I'd like to propose a small patch that allows better checkpoint progress
 monitoring. The patch is quite simple - it adds a new integer GUC
 checkpoint_update_limit and every time checkpoint writes this number of
 buffers, it does two things:

 (a) logs a checkpoint status message into the server log, with info
 about total number of buffers to write, number of already written buffers,
 current and average write speed and estimate of remaining time

 (b) sends bgwriter stats (so that the buffers_checkpoint is updated)

 I believe this will make checkpoint tuning easier, especially with large
 shared bufferers and large when there's other write activity (so that it's
 difficult to see checkpoint I/O).

 The default value (0) means this continuous logging is disabled.

This seems like the wrong thing to write to the log. It's really only
useful *during* the checkpoint run, isn't it? If so, I think it should
go in a pg_stat view. In theory, this could be the progress view or
progress field talked about around Gregs previous patch - or it
could just be modifying the commandstring in pg_stat_activity. Either
way, it should be updated in shared memory in that case (like current
query is), and not sent with a message to the collector.

IMHO, of course ;)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Tomas Vondra
On 26 Srpen 2011, 9:35, Greg Smith wrote:
 On 08/25/2011 04:57 PM, Tomas Vondra wrote:
 (b) sends bgwriter stats (so that the buffers_checkpoint is updated)

 As for throwing more log data out, I'm not sure what new analysis you're
 thinking of that it allows.  I/O gets increasingly spiky as you zoom in
 on it; averaging over a shorter period can easily end up providing less
 insight about trends.  If anything, I spend more time summarizing the
 data that's already there, rather than wanting to break them down.  It's
 already providing way too much detail for most people.  Customers tell
 me they don't care to see checkpoint stats unless they're across a day
 or more of sampling, so even the current once every ~5 minutes is way
 more info than they want.  I have all this log parsing code and things
 that look at pg_stat_bgwriter to collect that data and produce higher
 level reports.  And lots of it would break if any of this patch is added
 and people turn it on.  I imagine other log/stat parsing programs might
 suffer issues too.  That's your other hurdle for change here:  the new
 analysis techniques have to be useful enough to justify that some
 downstream tool disruption is inevitable.

I was aware that by continuously updating pg_stat_bgwriter, the data won't
be synchronized (i.e. the buffers_checkpoint counters will change while
the number of requested/timed checkpoints remain the same).

But does that really break the tools that process the data? When you're
working with summarized data, the result should be more or less the same
as the difference will be smoothed out by averaging etc. You can always
see just one in progress checkpoint, so if you get 24 checkpoints a day,
the difference will be 1/24 of a checkpoint. Yes, it's a difference.

A really crazy workaround would be to change checkpoints_requested /
checkpoints_timed to double, and use that to indicate current progress of
the checkpoint. So for example 10.54 would mean 10 checkpoints completed,
one checkpoint in progress, already written 54% of blocks. But yes, that's
a bit crazy.

 If you have an idea for how to use this extra data for something useful,
 let's talk about what that is and see if it's possible to build it in
 instead.  This problem is harder than it looks, mainly because the way
 the OS caches writes here makes trying to derive hard numbers from what
 the background writer is doing impossible.  When the database writes
 things out, and when they actually get written to disk, they are not the
 same event.  The actual write is often during the sync phase, and not
 being able to tracking that beast is where I see the most problems at.
 The write phase, the easier part to instrument in the database, that is
 pretty boring.  That's why the last extra logging I added here focused
 on adding visibility to the sync activity instead.

Hmmm, let me explain what led me to this patch - right now I'm doing a
comparison of filesystems with various block sizes (both fs and db
blocks). I've realized that the db block size significantly influences
frequency of checkpoints and amount of data to write, so I'm collecting
data from pg_stat_bgwriter too. The benchmark goes like this

1. collect pg_stat_bgwriter stats
2. run pgbench for 10 minutes
3. collect pg_stat_bgwriter stats (to compute difference with (1))
4. kill the postmaster

The problem is that when checkpoint stats are collected, there might be a
checkpoint in progress and in that case the stats are incomplete. In some
cases (especially with very small db blocks) this has significant impact
because the checkpoints are less frequent.

I can't infer this from other data (e.g. iostat), because that does allow
me what I/O is caused by the checkpoint.

Yes, this does not consider sync timing, but in my case that's not a big
issue (the page cache is rather small so the data are actually forced to
the drive soon).

I could probably live with keeping the current pg_stat_bgwriter logic
(i.e. updating just once) and writing checkpoint status just to the log. I
don't think that should break any current tools that parse logs, because
the message is completely different (prefixed with 'checkpoint status') so
any reasonably written tool should be OK.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Greg Smith

On 08/26/2011 03:54 AM, Magnus Hagander wrote:

In theory, this could be the progress view or
progress field talked about around Gregs previous patch - or it
could just be modifying the commandstring in pg_stat_activity.


Right.  The whole progress indicator idea is hard to do for queries in 
general.  But there's enough of these other progress indicator ideas 
around now that it may be worth putting a standard way to handle them in 
here.  It sounds like that would be sufficient to address the area Tomas 
is trying to instrument better.  I badly want a progress indicator on 
CREATE INDEX CONCURRENTLY too, to at least let me know what phase of the 
build process it's on.  That's turned into a major headache recently.


If we run with the idea of just allowing backends to publish a progress 
text string, I think this one maps into a similar space as the 
autovacuum one.  Publishing how many seconds the operation has been 
running for may be reasonable too.  Whether the overhead of the timing 
calls necessary to compute that will be high or not depends on the 
refresh rate of the progress info.  My suggestion before was to name 
these as key=value pairs for easy parsing; here's three examples now:


autovacumm:  pgbench_accounts h=182701 m=301515 d=321345 s=62.231
(cache hits, cache misses, dirty writes, seconds)

background writer:  checkpoint b=511 t=3072 s=5.321
(buffers written, total, seconds)

create index concurrently:  pgbench_accounts p=1 b=62 t=6213 s=81.232
(phase, blocks processed, total block estimate, seconds)

I think that the idea of making this easily human readable is 
optimistic, because it will make all these strings big enough to start 
mattering.  Given that, we almost have to assume the only consumers of 
this data will be able to interpret it using the documentation.  I'd be 
happy with just the minimal data set in each case, not including any 
statistics you can easily derive from the values given (like the MB/s 
readings).  Adding that figure in particular to more of the log messages 
would be nice though.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Tomas Vondra
On 26 Srpen 2011, 9:54, Magnus Hagander wrote:
 This seems like the wrong thing to write to the log. It's really only
 useful *during* the checkpoint run, isn't it? If so, I think it should
 go in a pg_stat view. In theory, this could be the progress view or
 progress field talked about around Gregs previous patch - or it
 could just be modifying the commandstring in pg_stat_activity. Either
 way, it should be updated in shared memory in that case (like current
 query is), and not sent with a message to the collector.

I personally find it handy for example when I need to find out why
performance degraded at a certain point in the past. Now I can see there
was a checkpoint (thanks to log_checkpoints=on), but I don't know any
details about it's progress.

I already collect info from the pg_stat_bgwriter, that's why I thought I
could update it more often. The log file is a natural destination for such
information, IMHO. I see that as an extension to the current checkpoint
messages that are written to the log.

The commandstring approach is probably fine for interactive work, but not
very handy when you need to analyze something that already happened.
Unless you collected the data, of course.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Tomas Vondra
On 26 Srpen 2011, 19:17, Greg Smith wrote:
 On 08/26/2011 03:54 AM, Magnus Hagander wrote:
 In theory, this could be the progress view or
 progress field talked about around Gregs previous patch - or it
 could just be modifying the commandstring in pg_stat_activity.

 Right.  The whole progress indicator idea is hard to do for queries in
 general.  But there's enough of these other progress indicator ideas
 around now that it may be worth putting a standard way to handle them in
 here.  It sounds like that would be sufficient to address the area Tomas
 is trying to instrument better.  I badly want a progress indicator on
 CREATE INDEX CONCURRENTLY too, to at least let me know what phase of the
 build process it's on.  That's turned into a major headache recently.

 If we run with the idea of just allowing backends to publish a progress
 text string, I think this one maps into a similar space as the
 autovacuum one.  Publishing how many seconds the operation has been
 running for may be reasonable too.  Whether the overhead of the timing
 calls necessary to compute that will be high or not depends on the
 refresh rate of the progress info.  My suggestion before was to name
 these as key=value pairs for easy parsing; here's three examples now:

 autovacumm:  pgbench_accounts h=182701 m=301515 d=321345 s=62.231
 (cache hits, cache misses, dirty writes, seconds)

 background writer:  checkpoint b=511 t=3072 s=5.321
 (buffers written, total, seconds)

 create index concurrently:  pgbench_accounts p=1 b=62 t=6213 s=81.232
 (phase, blocks processed, total block estimate, seconds)

 I think that the idea of making this easily human readable is
 optimistic, because it will make all these strings big enough to start
 mattering.  Given that, we almost have to assume the only consumers of
 this data will be able to interpret it using the documentation.  I'd be
 happy with just the minimal data set in each case, not including any
 statistics you can easily derive from the values given (like the MB/s
 readings).  Adding that figure in particular to more of the log messages
 would be nice though.

I'm a bit confused - are you talking about updating process title or about
writing the info to log? The process title is probably fine for watching
the progress interactively, but it really does not solve what I need.

I need to be able to infer the progress for past events, so I'd have to
sample the 'ps ax' output regularly.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Noah Misch
On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:
 Hmmm, let me explain what led me to this patch - right now I'm doing a
 comparison of filesystems with various block sizes (both fs and db
 blocks). I've realized that the db block size significantly influences
 frequency of checkpoints and amount of data to write, so I'm collecting
 data from pg_stat_bgwriter too. The benchmark goes like this
 
 1. collect pg_stat_bgwriter stats
 2. run pgbench for 10 minutes
 3. collect pg_stat_bgwriter stats (to compute difference with (1))
 4. kill the postmaster
 
 The problem is that when checkpoint stats are collected, there might be a
 checkpoint in progress and in that case the stats are incomplete. In some
 cases (especially with very small db blocks) this has significant impact
 because the checkpoints are less frequent.

Could you remove this hazard by adding a step 2a. psql -c CHECKPOINT?

-- 
Noah Mischhttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-25 Thread Josh Berkus
Tomas,

 I'd like to propose a small patch that allows better checkpoint progress
 monitoring. The patch is quite simple - it adds a new integer GUC
 checkpoint_update_limit and every time checkpoint writes this number of
 buffers, it does two things:

I'd rather not have a new GUC if we can avoid it.  What about just
making this some reasonable value (like 1000 buffers) if log_checkpoints
= on?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-25 Thread Tomas Vondra
On 26 Srpen 2011, 0:18, Josh Berkus wrote:
 Tomas,

 I'd like to propose a small patch that allows better checkpoint progress
 monitoring. The patch is quite simple - it adds a new integer GUC
 checkpoint_update_limit and every time checkpoint writes this number
 of
 buffers, it does two things:

 I'd rather not have a new GUC if we can avoid it.  What about just
 making this some reasonable value (like 1000 buffers) if log_checkpoints
 = on?

I was thinking about that too, but I think no value can fit all cases
because the systems may have very different I/O subsystems.

With one 7.2k drive I usually get about 25MB/s on average, with big arrays
/ good controllers / fast drives you can write much faster. So either the
value will be too low (and the log will be infested with those messages)
or too high (so it won't be updated very often).

And it depends on the number of shared buffers too. I thought it could be
something like 5% of shared buffers but it really does not solve the
issues. So I think there should be a new GUC.

OTOH, it's probably a good idea to log this only when the log_checkpoints
is enabled as those who tune checkpoints will enable log_checkpoints in
the first case.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-25 Thread Tomas Vondra
On 26 Srpen 2011, 0:39, Tomas Vondra wrote:
 On 26 Srpen 2011, 0:18, Josh Berkus wrote:
 Tomas,

 I'd like to propose a small patch that allows better checkpoint
 progress
 monitoring. The patch is quite simple - it adds a new integer GUC
 checkpoint_update_limit and every time checkpoint writes this number
 of
 buffers, it does two things:

 I'd rather not have a new GUC if we can avoid it.  What about just
 making this some reasonable value (like 1000 buffers) if log_checkpoints
 = on?

 I was thinking about that too, but I think no value can fit all cases
 because the systems may have very different I/O subsystems.

 With one 7.2k drive I usually get about 25MB/s on average, with big arrays
 / good controllers / fast drives you can write much faster. So either the
 value will be too low (and the log will be infested with those messages)
 or too high (so it won't be updated very often).

Hmmm, maybe we could use time instead of number of buffers? Something like
every 5 seconds, log the checkpoint progress instead of every time 1000
buffers is written ...

That should work on systems regardless of I/O performance.

But although using time instead of number of buffers seems like a good
idea, I don't think it eliminates the need for a new GUC.

(a) Even with time limit, I find it useful to be able to set the limits
differently.

(b) In some cases it may be useful to enable just basic (current
behaviour) checkpoint logging using log_checkpoints, so using it for this
new patch may not be a good idea. Although this could be fixed by allowing
three values no/basic/detailed instead of just true/false.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers