On Fri, 2008-04-04 at 02:21 -0400, Greg Smith wrote:
> Database stops checkpointing. WAL files pile up. In the middle of
> backup, system finally dies, and when it starts recovery there's a bad
> record in the WAL files--which there are now thousands of to apply, and
> the bad one is 4 hours of replay in. Believe it or not, it goes downhill
> from there.
> It's what kicked off the first step that's the big mystery. The only code
> path I thought of that can block checkpoints like this is when the
> archive_command isn't working anymore, and that wasn't being used. Given
> some of the other corruption found later and the bad memory issues
> discovered, a bit flipping in the "do I need to checkpoint now?" code or
> data seems just as likely as any other explanation.
A few additional comments here:
If you set checkpoint_segments very, very high, the only thing left to
trigger a checkpoint is checkpoint_timeout, which can be up to 60 minutes
away. If you did this for performance reasons, presumably you're generating
lots of WAL and could end up with thousands of segment files in that time
period.
If you set it too high, you hit the disk limits first, and you can then
crash the server if the pg_xlog directory's physical capacity happens to
be low.
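To put rough numbers on that: the docs give a rule of thumb that pg_xlog
normally holds at most 2 * checkpoint_segments + 1 segment files of 16 MB
each, so a quick sketch of the worst-case sizing (the 256-segment setting
below is just an illustrative "tuned for performance" value, not from the
original report) looks like:

```python
WAL_SEGMENT_MB = 16  # default WAL segment size

def pg_xlog_worst_case_mb(checkpoint_segments):
    """Rule-of-thumb upper bound on normal pg_xlog disk usage, in MB:
    at most 2 * checkpoint_segments + 1 segment files."""
    return (2 * checkpoint_segments + 1) * WAL_SEGMENT_MB

print(pg_xlog_worst_case_mb(3))    # default setting: 112 MB
print(pg_xlog_worst_case_mb(256))  # "tuned" setting: 8208 MB, ~8 GB
```

And note that bound assumes checkpoints are completing on schedule; if
checkpoints stall, as discussed above, WAL keeps accumulating past it.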
Starvation on the checkpoint start lock has been witnessed previously, so
if you're running 8.2 or earlier that could be a possible explanation
here. What can happen is that a checkpoint is triggered, yet the bgwriter
has to wait to get access to the CheckpointStartLock. I witnessed a
starvation of 3 minutes once in 2006, while testing a server running at
max velocity with 200 users. I assumed that was an outlier, but it's
possible for the wait to be longer. I wouldn't believe too much longer,
though. That was patched in 8.3 as a result.
Anyway, either of those factors, or their combination, plus a small
pg_xlog disk would be sufficient to explain the crash and the WAL file
build-up.
Sent via pgsql-patches mailing list (firstname.lastname@example.org)