On Thu, 3 Apr 2008, Robert Treat wrote:

You can plug a single item graphed over time into things like rrdtool to get good trending information. And it's often easier to do this using sql interfaces to get the data than pulling it out of log files (almost like the db was designed for that :-)

The pg_stat_bgwriter value for buffers_checkpoint was intentionally implemented in 8.3 such that it jumps in one big lump when the checkpoint is done. While it's not the ideal interface for what you're looking for, the reason for that is to made it possible to build a "when was the last checkpoint finished?" interface via some remote monitoring tool just by determining the last time that the value jumped upwards. You can easily see them just by graphing that value, it shouldn't be too hard to teach something with rrdtool guts to find them.

Since checkpoints have a fairly predictable duration in 8.3, as long as you catch the start or end of them you can make a resonable guess where the other side was. The case you're trying to avoid here, the system going a long time without checkpointing, can be implemented by looking for a begin or end regularly, you don't need to track both. As long as there's a checkpoint finish "pulse" in buffers_checkpoint showing up regularly you're fine. The only situation I can think of where this might be problematic is where the system has been idle enough to not have any buffers to write at checkpoint time, but I recall a code path there where checkpoints stop altogether unless there's been activity so even tracking the time may not change that.

Ultimately a lot of the other questions you might ask (i.e. "how many buffers have been written per hour by checkpoints?") require processing the numbers in this way anyway, and I thought this implementation was good enough to monitor the situation you're trying to avoid--presuming you're using some sort of moderately powerful remote monitoring tool. Theo's patch would make it easier to answer with a simple command which has some value; a little SQL in a cron job would be good enough to trigger an alert rather than needing a real monitoring probe.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

Reply via email to