Re: [PATCHES] Expose checkpoint start/finish times into SQL.

Greg Smith Thu, 03 Apr 2008 23:21:47 -0700

On Fri, 4 Apr 2008, Tom Lane wrote:

> (And you still didn't tell me what the actual failure case was.)

Database stops checkpointing. WAL files pile up. In the middle of backup, system finally dies, and when it starts recovery there's a bad record in the WAL files--which there are now thousands of to apply, and the bad one is 4 hours of replay in. Believe it or not, it goes downhill from there.

It's what kicked off the first step that's the big mystery. The only code path I thought of that can block checkpoints like this is when the archive_command isn't working anymore, and that wasn't being used. Given some of the other corruption found later and the bad memory issues discovered, a bit flipping in the "do I need to checkpoint now?" code or data seems just as likely as any other explanation.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-patches mailing list ([email protected])
