Re: [PATCHES] Logging checkpoints and other slowdown causes

Heikki Linnakangas Fri, 11 May 2007 03:04:06 -0700

Greg Smith wrote:

On Tue, 3 Apr 2007, Peter Eisentraut wrote:
Something that is aimed at a user should not be enabled at a "debug"
level.  Debug levels are for debugging, not for very high verbosity.
I asked for feedback about where to log at when I initially sent the first version of this in and didn't hear anything back on that part, so I pushed these in line with other log messages I saw. The messages for when checkpoints start and stop were both logged at DEBUG2, so I put progress reports on the other significant phases of the process there as well.


I agree that debug levels are not suitable for this.

I'm thinking of INFO, NOTICE or LOG. The user manual says about LOG:

LOG

Reports information of interest to administrators, e.g., checkpoint activity.

But looking at the code, all the checkpoint related messages are at DEBUG-levels, nothing gets printed at LOG-level. Printing the messages at LOG-level would bring the code in line with the documentation, but I don't think we want to fill the log with checkpoint chatter unless the DBA explicitly asks for that.

How about INFO? It seems like the best level for information on normal activity of the system. The documentation however has this to say about it:


INFO

Provides information implicitly requested by the user, e.g., during VACUUM VERBOSE.

We should adjust the documentation, but INFO seems like the best level to me. Or we could add a GUC variable similar to log_connections or log_statement to control if the messages are printed or not, and use LOG.
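To make the two options above concrete, here is a minimal sketch in Python (not PostgreSQL source) of how a message's level interacts with the server-log threshold. It assumes the severity ordering documented for log_min_messages, where LOG sorts above ERROR. The threshold value "NOTICE" is picked for illustration only, and the switch name log_checkpoints is hypothetical; it stands in for the GUC variable suggested above.

```python
# Server-log severity order, lowest to highest, as documented for
# log_min_messages (note that LOG ranks above ERROR here).
SERVER_LOG_ORDER = [
    "DEBUG5", "DEBUG4", "DEBUG3", "DEBUG2", "DEBUG1",
    "INFO", "NOTICE", "WARNING", "ERROR", "LOG", "FATAL", "PANIC",
]


def reaches_server_log(level, log_min_messages="NOTICE"):
    """Return True if a message at `level` passes the server-log threshold.

    The "NOTICE" threshold is an assumption chosen for illustration.
    """
    order = SERVER_LOG_ORDER
    return order.index(level) >= order.index(log_min_messages)


def checkpoint_message_logged(level, log_min_messages="NOTICE",
                              log_checkpoints=None):
    """Model the GUC-gated option.

    `log_checkpoints` is a hypothetical on/off switch. None means no such
    switch exists and only the level decides; False suppresses the message
    regardless of its level.
    """
    if log_checkpoints is False:
        return False
    return reaches_server_log(level, log_min_messages)


if __name__ == "__main__":
    for level in ("DEBUG2", "INFO", "LOG"):
        print(level, reaches_server_log(level))
    print("LOG with switch off:",
          checkpoint_message_logged("LOG", log_checkpoints=False))
```

Under these assumptions a DEBUG2 message stays out of the server log, a LOG message always gets in, and the switch is what lets the DBA opt in or out without touching the threshold.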

I don't expect these messages will be helpful for a normal user--that's what the new data in pg_stats_bgwriter is for. Their main purpose of this patch is debugging checkpoint related performance issues at a level I'd expect only a developer to work at; they're also helpful for someone writing benchmark code.

I disagree. They would be helpful to an administrator chasing down checkpoint related problems. E.g. checkpoints taking too long, occurring too often (though we already have log_checkpoint_warning for that), or to identify if transient performance problems that users complain about coincide with checkpoints. And at least I like to have messages like that in the log just to comfort me that everything is going ok.

There are several patches in process floating around that aim to adjust either the background writer or the checkpoint process to reduce the impact of checkpoints. This logging allows grading their success at that. As my tests with this patch in place suggest this problem is far from solved with any of the current suggestions, I'd like to get other developers looking at that problem the same way I have been; that's why I'd like to see some standardization on how checkpoints are instrumented. The fact that really advanced users might also use this for troubleshooting I consider a bonus rather than the main focus here.


Agreed.

Looking at the patch, I think we want one line in the log when checkpoint starts, one when buffer flushing starts (or maybe not, if we assume that checkpointing clog, subtrans and multixact don't take long), one when sync-phase starts and when the checkpoint is done. We don't need to print the times elapsed in each phase on a separate line, that's just derived information from the other lines, unless we use different log-levels for detail lines (like you did in your patch).
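As a rough illustration of why the per-phase elapsed times are derived information, the sketch below recovers them from four timestamped lines. The log excerpt, its timestamp format and the message wording are all invented for this example; they are not the output of the patch.

```python
from datetime import datetime

# Hypothetical log excerpt: timestamps and message text are invented.
LOG_LINES = [
    "2007-05-11 10:00:00.000 LOG:  checkpoint starting",
    "2007-05-11 10:00:00.250 LOG:  checkpoint: buffer flush starting",
    "2007-05-11 10:00:12.750 LOG:  checkpoint: sync starting",
    "2007-05-11 10:00:15.000 LOG:  checkpoint complete",
]

# Each phase runs from one line's timestamp to the next line's timestamp.
PHASES = ["clog/subtrans/multixact", "buffer flush", "sync"]


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.mmm' prefix (23 characters)."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")


def phase_durations(lines):
    """Return [(phase name, elapsed seconds)] from consecutive log lines."""
    stamps = [parse_timestamp(line) for line in lines]
    return [
        (name, (end - start).total_seconds())
        for name, start, end in zip(PHASES, stamps, stamps[1:])
    ]


if __name__ == "__main__":
    for name, seconds in phase_durations(LOG_LINES):
        print(f"{name}: {seconds:.2f} s")
```

If the buffer-flush line were dropped, as floated above, the first two phases would simply merge into one interval.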


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

