On 09/02/2011 11:10 AM, Tomas Vondra wrote:
My 'ideal' solution would be either to add another GUC (to turn this
on/off) or allow log_checkpoints to have three values

log_checkpoints = {off, normal, detailed}

where 'normal' provides the current output and 'detail' produces this much
verbose output.

If this is going to be acceptable, that's likely the only path it could happen by and still meet what you're looking for. I will just again stress that the part you're working on instrumenting better right now is not actually where larger systems really run into the most problems here, based on what I've seen. I added a series of log messages to 9.1 at DEBUG1, aimed at tracking the sync phase. That's where I see many more checkpoint issues than in the write one. On Linux in particular, it's almost impossible for the write phase to be more of a problem than the sync one.

So the logging you're adding here I don't ever expect to turn on. But I wouldn't argue against an option to handle the logging use-case you're concerned about. Letting people observe for themselves and decide which of the phases is more interesting to their workload seems appropriate. Then users have options for what to log, no matter which type of problem they run into.

If you're expanding log_checkpoints to an enum, for that to handle what I think everybody might ever want (for what checkpoints do now at least), I'd find that more useful if it happened like this instead:

log_checkpoints = {off, on, write, sync, verbose}

I don't think you should change the semantics of off/on, which will avoid breaking existing postgresql.conf files and resources that suggest tuning advice. "write" can toggle on what you're adding; "sync" should control whether the DEBUG1 messages showing the individual file names in the sync phase appear; and "verbose" can include both.

As far as a heuristic for making this less chatty when there's nothing exciting happening goes, I think something based on how much time has passed would be the best one. In your use case, I would guess you don't really care whether a message appears every n%. If I understand you correctly now, you would mainly care about getting enough log detail to know 1) when things are running really slow, or b) often enough that the margin of error in your benchmark results from unaccounted checkpoint writes is acceptable. In both of those cases, I'd think a time-based threshold would be appropriate, and that also deals with the time-based checkpoints, too.

If your logging criteria for the write phase was "display a message any time more than 30 seconds have passed since last seeing one", that would give you only a few lines of output in a boring, normal checkpoint--certainly less than the 9 in-progress samples you're outputting now, at 10% intervals. But in the pathological situations where writes are super slow, your log data would become correspondingly denser, which is exactly what you want in that situation.

I think combining the two makes the most sense: "log when >=30 seconds have passed since the last message, and there's been >=10% more progress made". (Maybe do the progress check before the time one, to cut down on gettimeofday() calls) That would give you 4 in-progress reports during a standard 2.5 minute write phase, and in cases where the checkpoints are taking a long time you'd get as many as 9. That's pretty close to auto-tuning the amount of log output, so the amount of it is roughly proportional to how likely the information it's logging will be interesting.

We certainly don't want to add yet another GUC just to control the frequency. I don't think it will be too hard to put two hard-coded thresholds in and do good enough for just about everyone though. I would probably prefer setting those thresholds to 60 seconds/20% instead. That might not be detailed enough for you though.

Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to