On Tue, 2008-05-06 at 21:51 +0100, Heikki Linnakangas wrote:

> In fact, what will happen if the checkpoint record's redo pointer points
> to an earlier xlog file:
>
> 1. The location of the checkpoint record is read by read_backup_label().
> Let's say that it's 0005.
> 2. ReadCheckpointRecord() is called for 0005. The restore command is
> called because that xlog file is not present. The safe truncation point
> is determined to be 0005, as that's what we're reading. Everything
> before that is truncated
> 3. The redo pointer in the checkpoint record points to 0003. That's
> where we should start the recovery. Oops :-(

Yes, this case could be a problem, if the records are in different
files. It's the files that matter, not the records themselves though.

I've extended the patch without introducing another new status variable,
which was my original concern with what you suggested previously.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

Index: contrib/pg_standby/pg_standby.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/contrib/pg_standby/pg_standby.c,v
retrieving revision 1.10
diff -c -r1.10 pg_standby.c
*** contrib/pg_standby/pg_standby.c 15 Nov 2007 21:14:30 -0000 1.10
--- contrib/pg_standby/pg_standby.c 3 May 2008 11:27:12 -0000
***************
*** 297,302 ****
--- 297,311 ----
      if (restartWALFileName)
      {
+         /*
+          * Don't do cleanup if the restartWALFileName provided
+          * is later than the xlog file requested. This is an error
+          * and we must not remove these files from archive.
+          * This shouldn't happen, but better safe than sorry.
+          */
+         if (strcmp(restartWALFileName, nextWALFileName) > 0)
+             return false;
+ 
          strcpy(exclusiveCleanupFileName, restartWALFileName);
          return true;
      }
***************
*** 584,590 ****
          fprintf(stderr, "\nMax wait interval : %d %s",
                  maxwaittime, (maxwaittime > 0 ? "seconds" : "forever"));
          fprintf(stderr, "\nCommand for restore : %s", restoreCommand);
!         fprintf(stderr, "\nKeep archive history : %s and later", exclusiveCleanupFileName);
          fflush(stderr);
      }
--- 593,603 ----
          fprintf(stderr, "\nMax wait interval : %d %s",
                  maxwaittime, (maxwaittime > 0 ? "seconds" : "forever"));
          fprintf(stderr, "\nCommand for restore : %s", restoreCommand);
!         fprintf(stderr, "\nKeep archive history : ");
!         if (need_cleanup)
!             fprintf(stderr, "%s and later", exclusiveCleanupFileName);
!         else
!             fprintf(stderr, "No cleanup required");
          fflush(stderr);
      }
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.300
diff -c -r1.300 xlog.c
*** src/backend/access/transam/xlog.c 24 Apr 2008 14:23:43 -0000 1.300
--- src/backend/access/transam/xlog.c 6 May 2008 23:04:30 -0000
***************
*** 2484,2489 ****
--- 2484,2522 ----
      }

      /*
+      * Calculate the archive file cutoff point for use during log shipping
+      * replication. All files earlier than this point can be deleted
+      * from the archive, though there is no requirement to do so.
+      *
+      * We initialise this with the filename of an InvalidXLogRecPtr, which
+      * will prevent the deletion of any WAL files from the archive
+      * because of the alphabetic sorting property of WAL filenames.
+      *
+      * Once we have successfully located the redo pointer of the checkpoint
+      * from which we start recovery we never request a file prior to the redo
+      * pointer of the last restartpoint. When redo begins we know that we
+      * have successfully located it, so there is no need for additional
+      * status flags to signify the point when we can begin deleting WAL files
+      * from the archive.
+      *
+      * We do the calculation now so that %r is always equal to or earlier
+      * than %f before we start to construct the command to be executed, as
+      * an additional cross-check on the sanity of our cutoff point.
+      */
+     if (InRedo)
+     {
+         XLByteToSeg(ControlFile->checkPointCopy.redo,
+                     restartLog, restartSeg);
+         XLogFileName(lastRestartPointFname,
+                      ControlFile->checkPointCopy.ThisTimeLineID,
+                      restartLog, restartSeg);
+         if (strcmp(lastRestartPointFname, xlogfname) > 0)
+             strcpy(lastRestartPointFname, xlogfname);
+     }
+     else
+         XLogFileName(lastRestartPointFname, 0, 0, 0);
+ 
+     /*
       * construct the command to be executed
       */
      dp = xlogRestoreCmd;
***************
*** 2512,2522 ****
                  case 'r':
                      /* %r: filename of last restartpoint */
                      sp++;
-                     XLByteToSeg(ControlFile->checkPointCopy.redo,
-                                 restartLog, restartSeg);
-                     XLogFileName(lastRestartPointFname,
-                                  ControlFile->checkPointCopy.ThisTimeLineID,
-                                  restartLog, restartSeg);
                      StrNCpy(dp, lastRestartPointFname, endp - dp);
                      dp += strlen(dp);
                      break;
--- 2545,2550 ----
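
For anyone who wants to try the cutoff logic outside the backend, here is
a minimal standalone sketch of what the two hunks above do. It is not
part of the patch. The helper names (wal_file_name, archive_cutoff,
cleanup_allowed) are made up for illustration, and the only thing assumed
from the tree is the fixed-width "%08X%08X%08X" file name format, which
is what makes strcmp() ordering match segment ordering within a timeline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define WAL_FNAME_LEN 24    /* 8 hex digits each: timeline, log id, segment */

/* Hypothetical stand-in for XLogFileName(): fixed-width upper-case hex. */
static void
wal_file_name(char *buf, size_t buflen,
              unsigned tli, unsigned log, unsigned seg)
{
    assert(buflen > WAL_FNAME_LEN);
    snprintf(buf, buflen, "%08X%08X%08X", tli, log, seg);
}

/*
 * Hypothetical helper mirroring the xlog.c hunk: compute the value that
 * would be substituted for %r.
 *
 * Before redo has started we return the all-zeroes name, which sorts
 * before every real WAL file, so nothing can be removed. Once in redo we
 * return the restartpoint's file, clamped so it is never later than the
 * file being requested (%f).
 */
static void
archive_cutoff(char *out, size_t outlen, int in_redo,
               const char *restartpoint_fname, const char *requested_fname)
{
    const char *cutoff;

    if (!in_redo)
    {
        wal_file_name(out, outlen, 0, 0, 0);
        return;
    }

    cutoff = (strcmp(restartpoint_fname, requested_fname) > 0)
        ? requested_fname
        : restartpoint_fname;
    snprintf(out, outlen, "%s", cutoff);
}

/*
 * Hypothetical helper mirroring the pg_standby.c hunk: refuse cleanup
 * when the restart file sorts after the file being requested.
 */
static int
cleanup_allowed(const char *restart_fname, const char *next_fname)
{
    return strcmp(restart_fname, next_fname) <= 0;
}
```

With Heikki's example, the checkpoint record is in segment 5 and its redo
pointer is in segment 3. While the checkpoint record is being read we are
not yet in redo, so archive_cutoff() gives the all-zeroes name and segment
3 stays in the archive. If a cutoff of segment 5 were ever passed while
segment 3 is being requested, cleanup_allowed() returns 0 and nothing is
removed.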