Re: [PATCHES] Verified fix for Bug 4137

Heikki Linnakangas Tue, 06 May 2008 07:00:54 -0700

Simon Riggs wrote:

Falling back to the secondary checkpoint implies we have a corrupted or
absent WAL file, so making recovery startup work correctly won't avoid
the need to re-run the base backup. We'll end with an unrecoverable
error in either case, so it doesn't seem worth attempting to improve
this in the way you suggest.

That's true whenever you have to fall back to a secondary checkpoint, but we still try to get the database up. One could argue that we shouldn't, of course.

Anyway, the point is that the patch relies on a non-obvious assumption. Even if the secondary checkpoint issue is a non-issue, it's not obvious (to me at least) that there aren't other similar scenarios. And someone might inadvertently break the assumption in a future patch, because it's not an obvious one; calling ReadRecord looks very innocent. We shouldn't introduce an assumption like that when we don't have to.

I think we should completely prevent access to secondary checkpoints
during archive recovery, because if the primary checkpoint isn't present
or is corrupt we aren't ever going to get past it to get to the
pg_stop_backup() point. Hence an archive recovery can never be valid in
that case. I'll do a separate patch for that because they are unrelated
issues.

Well, we already don't use the secondary checkpoint if a backup label file is present. And you can take a base backup without pg_start_backup()/pg_stop_backup() if you shut down the system first (a cold backup).
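
To make the behaviour described above concrete, here is a rough sketch in C of how the starting checkpoint might be chosen. It is only an illustration: the type and function names (RecoveryState, choose_checkpoint and so on) are hypothetical and this is not the actual startup code. It assumes that a backup label pins recovery to the checkpoint it names, with no fallback, and that without a label the primary checkpoint is tried before the secondary.

```c
#include <stdbool.h>

/* All names below are hypothetical; this is not the real startup code. */
typedef enum
{
    CKPT_NONE,          /* no usable checkpoint: startup has to fail */
    CKPT_BACKUP_LABEL,  /* checkpoint named in the backup label file */
    CKPT_PRIMARY,       /* latest checkpoint recorded in the control file */
    CKPT_SECONDARY      /* previous checkpoint, used only as a fallback */
} CheckpointChoice;

typedef struct
{
    bool have_backup_label;       /* a backup label file is present */
    bool label_checkpoint_valid;  /* its checkpoint record can be read */
    bool primary_valid;           /* primary checkpoint record can be read */
    bool secondary_valid;         /* secondary checkpoint record can be read */
} RecoveryState;

CheckpointChoice
choose_checkpoint(const RecoveryState *st)
{
    /* With a backup label, only the checkpoint it names is acceptable. */
    if (st->have_backup_label)
        return st->label_checkpoint_valid ? CKPT_BACKUP_LABEL : CKPT_NONE;

    /* Without a label (e.g. a cold backup), try primary, then secondary. */
    if (st->primary_valid)
        return CKPT_PRIMARY;
    if (st->secondary_valid)
        return CKPT_SECONDARY;

    return CKPT_NONE;
}
```

In this sketch a cold backup taken without pg_start_backup()/pg_stop_backup() has no backup label, so it follows the second path and may still fall back to the secondary checkpoint.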


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
