On Sat, 05 Apr 2008 16:37:15 +0100
Heikki Linnakangas [EMAIL PROTECTED] wrote:
May I just say that every person that is currently talking on this
thread is offtopic? Move it to -hackers please.
Joshua D. Drake
--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL
Robert Treat wrote:
1) Alert if checkpointing stops occurring within a reasonable time frame (note
there are failure cases and normal use cases where this might occur) (also
note I'll agree, this isn't common, but the results are pretty disastrous if
it does happen)
What are the normal use
Heikki Linnakangas wrote:
Robert Treat wrote:
2) Can be graphed over time (using rrdtool and others) for trending
checkpoint activity
Hmm. You'd need the historical data to do that properly. In particular,
if two checkpoints happen between the polling interval, you'd miss that.
Yes,
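The polling concern raised above can be illustrated with a small simulation. This is only a sketch: the timeline values and the function name are hypothetical, and it assumes the poller can see nothing but the time of the most recent checkpoint at each poll.

```python
from bisect import bisect_right


def observed_checkpoints(checkpoint_times, poll_times):
    """Count distinct checkpoints seen by a poller that only reads the
    latest checkpoint time. checkpoint_times must be sorted ascending."""
    seen = set()
    for t in poll_times:
        i = bisect_right(checkpoint_times, t)
        if i:  # at least one checkpoint has completed by time t
            seen.add(checkpoint_times[i - 1])
    return len(seen)


# Hypothetical timeline in seconds: five checkpoints, one poll every 300 s.
checkpoints = [100, 250, 400, 700, 1000]
polls = [300, 600, 900, 1200]

print(observed_checkpoints(checkpoints, polls), "of", len(checkpoints))
```

Two checkpoints (at 100 s and 250 s) complete before the first poll, so the earlier one is never observed. A cumulative counter would not lose that event, which is why a single "last checkpoint" value is a weak basis for trending.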
Greg Smith [EMAIL PROTECTED] writes:
On Thu, 3 Apr 2008, Tom Lane wrote:
I'd much rather be spending our time and effort on understanding what
broke for you, and fixing the code so it doesn't happen again.
[ shit happens... ]
Completely fair, but I still don't see how this particular patch
On Fri, 4 Apr 2008, Tom Lane wrote:
(And you still didn't tell me what the actual failure case was.)
Database stops checkpointing. WAL files pile up. In the middle of
backup, system finally dies, and when it starts recovery there's a bad
record in the WAL files--which there are now
Greg Smith [EMAIL PROTECTED] writes:
On Fri, 4 Apr 2008, Tom Lane wrote:
(And you still didn't tell me what the actual failure case was.)
Database stops checkpointing. WAL files pile up. In the middle of
backup, system finally dies, and when it starts recovery there's a bad
record in the
On Fri, 4 Apr 2008, Tom Lane wrote:
The actual advice I'd give to a DBA faced with such a case is to
kill -ABRT the bgwriter and send the stack trace to -hackers.
And that's a perfect example of where they're trying to get to. They
didn't notice the problem until after the crash. The
Greg Smith [EMAIL PROTECTED] writes:
... If they'd have noticed it while the server was up, perhaps because the
last checkpoint value hadn't changed in a long time (which seems like it
might be available via stats even if, as you say, the background writer is
out of its mind at that point),
Tom Lane wrote:
Greg Smith [EMAIL PROTECTED] writes:
... If they'd have noticed it while the server was up, perhaps because the
last checkpoint value hadn't changed in a long time (which seems like it
might be available via stats even if, as you say, the background writer is
out of its
On Friday 04 April 2008 01:59, Tom Lane wrote:
Greg Smith [EMAIL PROTECTED] writes:
On Thu, 3 Apr 2008, Tom Lane wrote:
I'd much rather be spending our time and effort on understanding what
broke for you, and fixing the code so it doesn't happen again.
[ shit happens... ]
Completely
Alvaro Herrera [EMAIL PROTECTED] writes:
Tom Lane wrote:
Greg Smith [EMAIL PROTECTED] writes:
... If they'd have noticed it while the server was up, perhaps because the
last checkpoint value hadn't changed in a long time (which seems like it
might be available via stats even if, as you
Alvaro Herrera [EMAIL PROTECTED] writes:
These kinds of things can be monitored externally very easily, say by
Nagios, when the values are available via the database. If you have to
troll the logs, it's quite a bit harder to do it.
I'm not sure about the right values to export -- last
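A minimal sketch of the kind of external check being described, assuming the time of the last checkpoint has already been obtained somehow (how to expose it is exactly what this thread debates). The function name and the thresholds are hypothetical; the return codes follow the usual Nagios plugin convention of 0 for OK, 1 for WARNING and 2 for CRITICAL.

```python
from datetime import datetime, timedelta

OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios plugin exit codes


def checkpoint_staleness(last_checkpoint, now, warn_after, crit_after):
    """Classify how stale the last checkpoint is.

    Returns (status, age), where age is now - last_checkpoint.
    """
    age = now - last_checkpoint
    if age >= crit_after:
        return CRITICAL, age
    if age >= warn_after:
        return WARNING, age
    return OK, age


# Hypothetical values: last checkpoint 40 minutes ago, warn at 15, page at 30.
now = datetime(2008, 4, 4, 12, 0, 0)
status, age = checkpoint_staleness(
    last_checkpoint=now - timedelta(minutes=40),
    now=now,
    warn_after=timedelta(minutes=15),
    crit_after=timedelta(minutes=30),
)
print(status, age)
```

Sensible thresholds depend on checkpoint_timeout and on workload, so the numbers here are placeholders rather than recommendations.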
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into SQL.
I suggest using GetCurrentTimestamp() directly instead of time_t and
converting.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting,
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into SQL.
Why is that useful?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your
On Thu, 03 Apr 2008 23:21:49 +0100
Heikki Linnakangas [EMAIL PROTECTED] wrote:
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into
SQL.
Why is that useful?
For knowing how long checkpoints are taking.
Heikki Linnakangas [EMAIL PROTECTED] writes:
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into SQL.
Why is that useful?
Does this implementation even work? It looks to me like the
globalStats.last_checkpoint_start/done fields will go back to zero the
Joshua D. Drake wrote:
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into
SQL.
Why is that useful?
For knowing how long checkpoints are taking. If they are taking too
long you may need to adjust your bgwriter settings, and it is a
Joshua D. Drake [EMAIL PROTECTED] writes:
Heikki Linnakangas [EMAIL PROTECTED] wrote:
Why is that useful?
For knowing how long checkpoints are taking. If they are taking too
long you may need to adjust your bgwriter settings, and it is a
serious drag to parse postgresql logs for this info.
Robert Treat wrote:
On Thursday 03 April 2008 19:08, Andrew Dunstan wrote:
Joshua D. Drake wrote:
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into
SQL.
Why is that useful?
For knowing how long checkpoints are
On Thursday 03 April 2008 19:08, Andrew Dunstan wrote:
Joshua D. Drake wrote:
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into
SQL.
Why is that useful?
For knowing how long checkpoints are taking. If they are taking too
long you may need
On Thu, 03 Apr 2008 20:29:18 -0400
Tom Lane [EMAIL PROTECTED] wrote:
Joshua D. Drake [EMAIL PROTECTED] writes:
Heikki Linnakangas [EMAIL PROTECTED] wrote:
Why is that useful?
For knowing how long checkpoints are taking. If they are taking
On Thu, 03 Apr 2008 20:45:37 -0400
Andrew Dunstan [EMAIL PROTECTED] wrote:
Exposing everything into the log files isn't always sufficient
(says the guy who maintains a remote admin tool)
It should be now that you can have machine readable
Joshua D. Drake [EMAIL PROTECTED] writes:
I would agree with this. We would need a history of checkpoints that
didn't reset until we told it to.
Indeed, but the submitted patch has nought whatsoever to do with that.
It exposes some instantaneous state.
You could perhaps *build* a log facility
On Thu, 03 Apr 2008 21:26:46 -0400
Tom Lane [EMAIL PROTECTED] wrote:
Joshua D. Drake [EMAIL PROTECTED] writes:
I would agree with this. We would need a history of checkpoints that
didn't reset until we told it to.
Indeed, but the submitted
On Apr 3, 2008, at 7:08 PM, Andrew Dunstan wrote:
Joshua D. Drake wrote:
Theo Schlossnagle wrote:
First whack at exposing the start and finish checkpoint times into
SQL.
Why is that useful?
For knowing how long checkpoints are taking. If they are taking too
long you may need to
Joshua D. Drake wrote:
Exposing everything into the log files isn't always sufficient
(says the guy who maintains a remote admin tool)
It should be now that you can have machine readable logs (says the
guy who literally spent weeks making that happen) ;-)
And how does the
On Thu, 03 Apr 2008 21:44:00 -0400
Andrew Dunstan [EMAIL PROTECTED] wrote:
I think there is quite possibly a good case for keeping some
diagnostics in a table or tables, on a rolling basis, maybe. But then
that's a facility that needs to be
Theo Schlossnagle [EMAIL PROTECTED] writes:
Heikki: It is useful for knowing when the last checkpoint occurred.
I guess I'm wondering why that's important. In the current bgwriter
design, the system spends half its time checkpointing (or in general
checkpoint_completion_target % of the
Theo Schlossnagle wrote:
Has this feature been discussed on -hackers? I don't recall it (and
my memory has plenty of holes in it), but I'm sure that after
attending my talk last Sunday Theo hasn't sent in a patch for an
undiscussed feature ;-)
Andrew: I don't think this feature has
On Thursday 03 April 2008 21:14, Joshua D. Drake wrote:
On Thu, 03 Apr 2008 20:29:18 -0400
Tom Lane [EMAIL PROTECTED] wrote:
Joshua D. Drake [EMAIL PROTECTED] writes:
Heikki Linnakangas [EMAIL PROTECTED] wrote:
Why is that useful?
For knowing how long checkpoints are taking. If
Robert Treat [EMAIL PROTECTED] writes:
Tom Lane [EMAIL PROTECTED] wrote:
3. As of PG 8.3, the bgwriter tries very hard to make the elapsed time
of a checkpoint be just about checkpoint_timeout *
checkpoint_completion_target, regardless of load factors. So unless
your settings are completely
On Thu, 03 Apr 2008 22:33:15 -0400
Tom Lane [EMAIL PROTECTED] wrote:
JD seems to be on record that the existing logging mechanism sucks
and he needs something else. That's fine, but I think it means that
we need to improve logging in general, not
On Apr 3, 2008, at 10:33 PM, Tom Lane wrote:
Theo claimed he had a reason for wanting to know the latest checkpoint
time, *without* any intention of time-extended tracking of that; but
he didn't say what it was. If there is a credible reason for that
then it might justify a patch of this
Theo Schlossnagle [EMAIL PROTECTED] writes:
On Apr 3, 2008, at 10:33 PM, Tom Lane wrote:
Theo claimed he had a reason for wanting to know the latest checkpoint
time, *without* any intention of time-extended tracking of that; but
he didn't say what it was.
We had a recent event where the
On Thu, 3 Apr 2008, Robert Treat wrote:
You can plug a single item graphed over time into things like rrdtool to
get good trending information. And it's often easier to do this using
sql interfaces to get the data than pulling it out of log files (almost
like the db was designed for that :-)
Greg Smith wrote:
On Thu, 3 Apr 2008, Robert Treat wrote:
You can plug a single item graphed over time into things like rrdtool to
get good trending information. And it's often easier to do this using
sql interfaces to get the data than pulling it out of log files (almost
like the db
On Thu, 3 Apr 2008, Joshua D. Drake wrote:
For knowing how long checkpoints are taking. If they are taking too
long you may need to adjust your bgwriter settings, and it is a
serious drag to parse postgresql logs for this info.
There's some disconnect here between what I think you want here
On Thu, 3 Apr 2008, Tom Lane wrote:
As of PG 8.3, the bgwriter tries very hard to make the elapsed time of a
checkpoint be just about checkpoint_timeout *
checkpoint_completion_target, regardless of load factors.
In the cases where the timing of checkpoint writes is timeout driven.
When
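As a worked example of the product quoted above, here is a small sketch. The function name is hypothetical, and the sample values (a 300 second checkpoint_timeout and a checkpoint_completion_target of 0.5) are assumed to match the 8.3 defaults. As the reply notes, the estimate only applies when the checkpoint is timeout driven.

```python
def expected_checkpoint_seconds(checkpoint_timeout_s, completion_target):
    """Approximate elapsed time of a timeout-driven checkpoint:
    checkpoint_timeout * checkpoint_completion_target."""
    if not 0.0 <= completion_target <= 1.0:
        raise ValueError("completion_target must be between 0 and 1")
    return checkpoint_timeout_s * completion_target


# Assumed settings: 5 minute timeout, completion target of 0.5.
print(expected_checkpoint_seconds(300, 0.5))  # 150.0
```

Under those assumed settings a checkpoint would be spread over roughly two and a half minutes, so its duration says more about the configuration than about load.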
On Friday 04 April 2008 00:09, Greg Smith wrote:
On Thu, 3 Apr 2008, Robert Treat wrote:
You can plug a single item graphed over time into things like rrdtool to
get good trending information. And it's often easier to do this using
sql interfaces to get the data than pulling it out of log
Robert Treat [EMAIL PROTECTED] writes:
I have to add, given that we already provide the time of last checkpoint
information via pg_controldata, I don't understand why people are against
making that information accessible to remote clients.
So, I can expect to see a patch next week that
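For context, a minimal sketch of pulling the last checkpoint time out of pg_controldata output. The sample text is hypothetical and heavily abbreviated, the function name is made up for illustration, and the timestamp format varies with locale, so the value is returned as an unparsed string.

```python
def latest_checkpoint_field(controldata_output):
    """Return the raw 'Time of latest checkpoint' value, or None if absent."""
    for line in controldata_output.splitlines():
        # Split on the first colon only; the timestamp itself contains colons.
        label, sep, value = line.partition(":")
        if sep and label.strip() == "Time of latest checkpoint":
            return value.strip()
    return None


# Hypothetical, abbreviated sample of pg_controldata output.
sample = """\
Database cluster state:               in production
Time of latest checkpoint:            Fri 04 Apr 2008 11:20:00 AM EDT
"""

print(latest_checkpoint_field(sample))
```

Because pg_controldata reads the data directory, a script like this has to run on the database host, which is the limitation behind the request for something reachable by remote clients.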
On Thu, 3 Apr 2008, Tom Lane wrote:
the system stopped checkpointing does not strike me as a routine
occurrence that we should be making provisions for DBAs to watch for.
What, pray tell, is the DBA supposed to do when and if he notices that?
Schedule downtime rather than wait for it to