Re: [PATCHES] Verified fix for Bug 4137

Simon Riggs Tue, 06 May 2008 04:22:00 -0700

On Tue, 2008-05-06 at 12:02 +0100, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > The problem was that at the very start of archive recovery the %r
> > parameter in restore_command could be set to a filename later than the
> > currently requested filename (%f). This could lead to early truncation
> > of the archived WAL files and would cause warm standby replication to
> > fail soon afterwards, in certain specific circumstances.
> > 
> > Fix applied to both core server in generating correct %r filenames and
> > also to pg_standby to prevent acceptance of out-of-sequence filenames.
> 
> So the core problem is that we use ControlFile->checkPointCopy.redo in 
> RestoreArchivedFile to determine the safe truncation point, but when 
> there's a backup label file, that's still coming from pg_control file, 
> which is wrong.
> 
> The patch fixes that by determining the safe truncation point as 
> Min(checkPointCopy.redo, xlogfname), where xlogfname is the xlog file 
> being restored. That depends on the assumption that everything before 
> the first file we (try to) restore is safe to truncate. IOW, we never 
> try to restore file B first, and then A, where A < B.
> 
> I'm not totally convinced that's a safe assumption. As an example, 
> consider doing an archive recovery, but without a backup label, and the 
> latest checkpoint record is broken. We would try to read the latest 
> (broken) checkpoint record first, and call RestoreArchivedFile to get 
> the xlog file containing that. But because that record is broken, we 
> fall back to using the previous checkpoint, and will need the xlog file 
> where the previous checkpoint record is in.
> 
> That's a pretty contrived example, but the point is that assumption 
> feels fragile. At the very least it should be noted explicitly in the 
> comments. A less fragile approach would be to use something dummy, like 
> "000000000000000000000000" as the truncation point until we've 
> successfully read the checkpoint/restartpoint record and started the replay.


Your comments are interesting and I'll think through that some more to
see if I can see if that leads to bugs. Will talk more later.

The only valid starting place, in all cases, is the checkpoint written
by pg_start_backup(). The user doesn't need to have been archiving files
prior to that so could legitimately have had a null archive_command.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-patches mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

Re: [PATCHES] Verified fix for Bug 4137

Reply via email to