On Thursday, December 06, 2012 9:35 AM Kyotaro HORIGUCHI wrote:
> Hello, I have a problem with PostgreSQL 9.2 with Pacemaker.
>
> The HA standby sometimes fails to start under normal operation.
>
> Testing with a bare replication pair showed that the standby fails
> startup recovery under the operation sequence shown below. 9.3dev is
> affected too, but 9.1 does not show this problem. The problem became
> apparent through the invalid-page check of xlog, but 9.1 potentially
> has the same glitch.
>
> After investigation, the lag of minRecoveryPoint behind EndRecPtr in
> the redo loop seems to be the cause. The lag brings about repetitive
> redoing of unrepeatable xlog sequences such as XLOG_HEAP2_VISIBLE ->
> SMGR_TRUNCATE on the same page. So I did the same aid work as
> xact_redo_commit_internal for smgr_redo. While doing this, I noticed
> that CheckRecoveryConsistency() in the redo apply loop should be after
> redoing the record, so I moved it.
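To make the ordering in question concrete, here is a toy model of the redo apply loop. It is not the code in xlog.c: ToyRecord, ToyResult, toy_redo_loop and the stop_at parameter are names made up for illustration, and the model assumes that the consistency test simply compares minRecoveryPoint with the end of the record just read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a WAL record: only its end position matters here. */
typedef struct { unsigned long end_lsn; } ToyRecord;

typedef struct {
    bool   reached_consistency; /* would connections be allowed? */
    size_t applied;             /* number of records redone */
} ToyResult;

/*
 * Toy redo loop. stop_at is the index of the record at which the stand-in
 * for recoveryStopsHere() fires. check_before_stop selects the ordering:
 *   true  -> consistency check first, then stop check, then redo
 *   false -> stop check first, then redo, then consistency check
 */
static ToyResult
toy_redo_loop(const ToyRecord *recs, size_t n,
              unsigned long min_recovery_point,
              size_t stop_at, bool check_before_stop)
{
    ToyResult r = { false, 0 };

    assert(recs != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        /* Assumption: this position advances when the record is read. */
        unsigned long end_rec_ptr = recs[i].end_lsn;

        if (check_before_stop && min_recovery_point <= end_rec_ptr)
            r.reached_consistency = true;

        if (i == stop_at)
            break;              /* recovery stops or pauses here */

        r.applied++;            /* stand-in for redoing the record */

        if (!check_before_stop && min_recovery_point <= end_rec_ptr)
            r.reached_consistency = true;
    }
    return r;
}
```

In this model, with records ending at 100, 200 and 300, minRecoveryPoint at 200 and a stop at the second record, the "check first" ordering reports consistency before the stop, while the "check after redo" ordering stops without ever running the check. Both orderings apply exactly one record.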
I think moving CheckRecoveryConsistency() to after the redo step in the apply loop might cause a problem. Currently it is called before recoveryStopsHere(), which can allow connections on a hot standby. With the change, if recovery pauses or stops for some reason because of recoveryStopsHere(), connections might not be allowed, because CheckRecoveryConsistency() is never called.

With Regards,
Amit Kapila.