Re: [HACKERS] Hot standby doesn't come up on some situation.

Kyotaro HORIGUCHI Sun, 02 Mar 2014 16:28:19 -0800

Hello,

At Fri, 28 Feb 2014 14:45:58 +0200, Heikki Linnakangas 
<[email protected]> wrote in <[email protected]>
> > Yes, but the same stuation could be made by restarting crashed
> > secondary.
> 
> Yeah.
> 
> > I have no idea about the scenario on whitch this behavior was regarded
> > as
> > undesirable but anyway I think that the secondry should start
> > accepting
> > client just after crash recovery is completed.
> 
> Agreed, this is a bug.
> 
> I don't think your patch is the right fix for this though. Setting
> minRecoveryPoint to EndRecPtr is the right thing to do; EndRecPtr
> points to the end of the last read and replayed record. What's wrong
> in this case is lastReplayedEndRecptr. At the beginning of recovery,
> it's initialized to the REDO point, but with a shutdown checkpoint,
> that's not quite right. When starting from a shutdown checkpoint, REDO
> points to the beginning of the shutdown record, but we've already
> effectively replayed it. The next record we replay is the one after
> the checkpoint.


It's more reasonable. I felt uncelar about that but I forgot to
doubt the correctness of lastReplayedEndRecptr then, but surely
the shutdown record itself was effectively alredy replayed when
the recored is inseretd.

> To see that, I added some elog(LOG) calls:
> 
> ~/pgsql.93stable$ bin/postmaster -D data
> LOG:  database system was shut down at 2014-02-28 14:06:18 EET
> LOG:  ReadCheckpointRecord: 0/16479C98
> LOG:  database system is ready to accept connections
> LOG:  autovacuum launcher started
> ^CLOG:  received fast shutdown request
> LOG:  aborting any active transactions
> LOG:  autovacuum launcher shutting down
> LOG:  shutting down
> LOG: INSERT @ 0/16479D00: prev 0/16479C98; xid 0; len 72: XLOG -
> checkpoint: redo 0/16479D00; tli 1; prev tli 1; fpw true; xid
> 0/793393; oid 24988; multi 655288; offset 1356722; oldest xid 687 in
> DB 1; oldest multi 1 in DB 1; oldest running xid 0; shutdown
> LOG:  xlog flush request 0/16479D68; write 0/0; flush 0/0
> LOG:  database system is shut down
> ~/pgsql.93stable$ bin/postmaster -D data
> LOG:  database system was shut down at 2014-02-28 14:06:23 EET
> LOG:  ReadCheckpointRecord: 0/16479D00
> LOG:  database system is ready to accept connections
> LOG:  autovacuum launcher started
> Killed
> 
> At this point, the last record is the shutdown checkpoint, beginning
> at 16479D00, and the server has been killed (immediate shutdown).
> 
> ~/pgsql.93stable$ cp recovery.conf data/recovery.conf
> ~/pgsql.93stable$ bin/postmaster -D data
> LOG: database system was interrupted; last known up at 2014-02-28
> 14:06:29 EET
> LOG:  entering standby mode
> LOG:  ReadCheckpointRecord: 0/16479D00
> LOG: database system was not properly shut down; automatic recovery in
> progress
> LOG:  record with zero length at 0/16479D68
> LOG:  reached end of WAL in pg_xlog, entering archive recovery
> LOG:  EndRecPtr: 0/16479D68 lastReplayedEndRecPtr: 0/16479D00
> FATAL: could not connect to the primary server: could not connect to
> server: Connection refused
> ...
> 
> Recovery starts from the checkpoint record, but lastReplayedEndRecPtr
> is set to the *beginning* of the checkpoint record, even though the
> checkpoint record has already been effectively replayed, by the feat
> of starting recovery from it. EndRecPtr correctly points to the end of
> the checkpoint record. Because of the incorrect lastReplayedEndRecPtr
> value, the CheckRecoveryConsistency() call concludes that it's not
> consistent.

I completely understood the behavior thanks to your detailed
explanation. (And how to use log messages effectively :-)

I agree that the fix is appropriate.

> I believe the attached fix is the right way to fix this.

It also worked for me. Thank you.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Hot standby doesn't come up on some situation.

Reply via email to