On Fri, 2008-09-26 at 11:20 +0100, Simon Riggs wrote:

> > After reading this for awhile, I realized that there is a rather
> > fundamental problem with it: it switches into "consistent recovery"
> > mode as soon as it's read WAL beyond ControlFile->minRecoveryPoint.
> > In a crash recovery situation that typically is before the last
> > checkpoint (if indeed it's not still zero), and what that means is
> > that this patch will activate the bgwriter and start letting in
> > backends instantaneously after a crash, long before we can have any
> > certainty that the DB state really is consistent.
> >
> > In a normal crash recovery situation this would be easily fixed by
> > simply not letting it go to "consistent recovery" state at all, but
> > what about recovery from a restartpoint? We don't want a slave that's
> > crashed once to never let backends in again. But I don't see how to
> > determine that we're far enough past the restartpoint to be consistent
> > again. In crash recovery we assume (without proof ;-)) that we're
> > consistent once we reach the end of valid-looking WAL, but that rule
> > doesn't help for a slave that's following a continuing WAL sequence.
> >
> > Perhaps something could be done based on noting when we have to pull in
> > a WAL segment from the recovery_command, but it sounds like a pretty
> > fragile assumption.
>
> Seems like we just say we only signal the postmaster if
> InArchiveRecovery. Archive recovery from a restartpoint is still archive
> recovery, so this shouldn't be a problem in the way you mention. The
> presence of recovery.conf overrides all other cases.
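To make the rule quoted above concrete, here is a minimal sketch of the decision, written as a toy Python model rather than as the patch's actual C code. The function name and the use of plain integers for WAL positions are assumptions made for illustration; only InArchiveRecovery and minRecoveryPoint come from the discussion, and whether the comparison should be strict is left open here.

```python
def may_signal_postmaster(in_archive_recovery: bool,
                          replayed_location: int,
                          min_recovery_point: int) -> bool:
    """Toy model of when the startup process could report consistency.

    Hypothetical helper: WAL locations are modelled as plain integers.
    """
    # Plain crash recovery never enters "consistent recovery" mode,
    # so the postmaster is never signalled in that case.
    if not in_archive_recovery:
        return False
    # In archive recovery (recovery.conf present), wait until replay
    # has reached minRecoveryPoint (assumed inclusive in this sketch).
    return replayed_location >= min_recovery_point
```

With this shape, a slave restarting from a restartpoint is still in archive recovery, so it can let backends in again once it has replayed far enough.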
Anticipating your possible responses, I would add this also:

There has long been an annoying hole in the PITR scheme, which is the
support of recovery using a crashed database. That is there to support
split mirror snapshots, but it creates a loophole where we don't know
the min recovery location, circumventing the care we (you!) took to put
stop/start backup in place.

I think we need to add a parameter to recovery.conf that people can use
to specify a minRecoveryPoint iff there is no backup label file. They
can work out what this should be by following this procedure, which we
should document:

* split mirror, so you have offline copy of crashed database
* copy database away to backup
* go to running database and run pg_current_xlog_insert_location()
* use the value to specify recovery_min_location

If they don't specify this, then bgwriter will not start and you cannot
run in Hot Standby mode. Their choice, so we need not worry then about
the loophole any more.

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

-- 
Sent via pgsql-patches mailing list (email@example.com)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
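The proposal in this message can be summarised as a small decision function. This is only a sketch under stated assumptions: recovery_min_location is the parameter proposed here, not an existing setting, and the helper names, the Optional-based modelling, and the sample location string are all hypothetical.

```python
from typing import Optional


def parse_location(text: str) -> int:
    """Convert a WAL location such as '0/3000020' (two hex fields)
    into one integer so that locations can be compared."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def effective_min_recovery_point(location_from_backup: Optional[int],
                                 recovery_min_location: Optional[str]
                                 ) -> Optional[int]:
    """Return the minimum recovery location, or None if it is unknown."""
    # With a backup label file, start/stop backup already tells us
    # how far we must replay; the proposed parameter is not consulted.
    if location_from_backup is not None:
        return location_from_backup
    # No backup label (e.g. a split mirror snapshot): rely on the
    # value the administrator supplied in recovery.conf, if any.
    if recovery_min_location is not None:
        return parse_location(recovery_min_location)
    return None


def hot_standby_allowed(location_from_backup: Optional[int],
                        recovery_min_location: Optional[str]) -> bool:
    # Without a known minimum location, bgwriter is not started and
    # backends are not let in.
    return effective_min_recovery_point(
        location_from_backup, recovery_min_location) is not None
```

For example, `hot_standby_allowed(None, None)` is false for a split mirror copy with no parameter set, while supplying the value captured from pg_current_xlog_insert_location() on the running database makes it true.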