Re: [PATCHES] Verified fix for Bug 4137

Heikki Linnakangas Tue, 06 May 2008 04:05:56 -0700

Simon Riggs wrote:

The problem was that at the very start of archive recovery the %r
parameter in restore_command could be set to a filename later than the
currently requested filename (%f). This could lead to early truncation
of the archived WAL files and would cause warm standby replication to
fail soon afterwards, in certain specific circumstances.


Fix applied to both core server in generating correct %r filenames and
also to pg_standby to prevent acceptance of out-of-sequence filenames.

So the core problem is that we use ControlFile->checkPointCopy.redo in
RestoreArchivedFile to determine the safe truncation point, but when
there's a backup label file, that's still coming from pg_control file,
which is wrong.

The patch fixes that by determining the safe truncation point as
Min(checkPointCopy.redo, xlogfname), where xlogfname is the xlog file
being restored. That depends on the assumption that everything before
the first file we (try to) restore is safe to truncate. IOW, we never
try to restore file B first, and then A, where A < B.
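
To make the Min() idea concrete, here is a minimal sketch, not the
patch's actual code. The helper names (earlier_xlog_fname,
safe_truncation_point) are made up for illustration, and it assumes
both names are 24-character uppercase hex segment names on the same
timeline, so that a plain string comparison orders them correctly.

```c
#include <assert.h>
#include <string.h>

#define XLOG_FNAME_LEN 24   /* 8 hex digits each: timeline, log, segment */

/*
 * Hypothetical helper, not taken from the patch: return whichever of two
 * WAL segment file names sorts earlier. Fixed-width uppercase hex names
 * on the same timeline compare correctly with strcmp().
 */
static const char *
earlier_xlog_fname(const char *a, const char *b)
{
    assert(strlen(a) == XLOG_FNAME_LEN && strlen(b) == XLOG_FNAME_LEN);
    return (strcmp(a, b) <= 0) ? a : b;
}

/*
 * Choose the name to hand out as %r: the earlier of the segment that
 * holds the redo pointer from pg_control and the segment currently
 * being restored (%f). Names and signature are assumptions.
 */
static const char *
safe_truncation_point(const char *redo_fname, const char *restoring_fname)
{
    return earlier_xlog_fname(redo_fname, restoring_fname);
}
```

With this, if pg_control points at a later segment than the one being
requested, %r is clamped to the requested file, so nothing the restore
still needs gets reported as removable.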

I'm not totally convinced that's a safe assumption. As an example,
consider doing an archive recovery, but without a backup label, and the
latest checkpoint record is broken. We would try to read the latest
(broken) checkpoint record first, and call RestoreArchivedFile to get
the xlog file containing that. But because that record is broken, we
fall back to using the previous checkpoint, and will need the xlog file
where the previous checkpoint record is in.

That's a pretty contrived example, but the point is that assumption
feels fragile. At the very least it should be noted explicitly in the
comments. A less fragile approach would be to use something dummy, like
"000000000000000000000000" as the truncation point until we've
successfully read the checkpoint/restartpoint record and started the
replay.
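
A rough sketch of that alternative, under stated assumptions: the
RecoveryState struct, its fields, and truncation_point() are invented
for illustration and do not correspond to anything in xlog.c. The idea
is only that an all-zeros name sorts before every real segment, so
nothing is reported as removable until replay has actually started.

```c
#include <stdbool.h>
#include <string.h>

#define XLOG_FNAME_LEN 24   /* 8 hex digits each: timeline, log, segment */

/* Hypothetical recovery state; field names are assumptions. */
typedef struct RecoveryState
{
    bool replay_started;                  /* checkpoint read, replay begun */
    char redo_fname[XLOG_FNAME_LEN + 1];  /* segment holding the redo ptr */
} RecoveryState;

/*
 * Return the name to use as the truncation point. Until the
 * checkpoint/restartpoint record has been read successfully and replay
 * has started, hand out a dummy all-zeros name.
 */
static const char *
truncation_point(const RecoveryState *state)
{
    if (!state->replay_started)
        return "00000000" "00000000" "00000000";
    return state->redo_fname;
}
```

This avoids relying on the order in which files are requested: in the
broken-checkpoint case above, the fallback to the previous checkpoint
happens while the dummy value is still in effect.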


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

