Simon Riggs wrote:
On Tue, 2008-05-06 at 17:52 +0100, Heikki Linnakangas wrote:
I didn't suggest that alphabetical sorting property is a new assumption; it sure isn't. The new assumption is that you never call ReadRecord() for record 0002 before you call it for record 0001 (before initializing the checkPointCopy field from the checkpoint record, anyway).

I've not done anything with the ordering of calls to ReadRecord(). There
is no such assumption introduced here.

Hmm, I see that I was inaccurate when I said the patch introduces that assumption, sorry about the confusion. It's more like the code is currently completely oblivious of the issue, and the patch makes it better but doesn't fully fix it.

The code in CVS is broken, as we now know, because it assumes that we can delete all files < checkPointCopy.redo, which is not true when checkPointCopy.redo has not yet been initialized from the backup label file, and points to a location that's ahead of the real safe truncation point. The patch makes it better, and the assumption is now that it's safe to truncate everything < min(checkPointCopy.redo, xlog file we're reading). That still seems too fragile to me, because we assume that after you've asked for xlog record X, we never need anything earlier then that.

In fact, what will happen if the checkpoint record's redo pointer points to an earlier xlog file:

1. The location of the checkpoint record is read by read_backup_label(). Let's say that it's 0005. 2. ReadCheckpointRecord() is called for 0005. The restore command is called because that xlog file is not present. The safe truncation point is determined to be 0005, as that's what we're reading. Everything before that is truncated 3. The redo pointer in the checkpoint record points to 0003. That's where we should start the recovery. Oops :-(

I haven't tested this, so, am I missing something that makes that not happen?

  Heikki Linnakangas

Sent via pgsql-patches mailing list (
To make changes to your subscription:

Reply via email to