> On 26 Jan 2017, at 10:34, Michael Paquier <michael.paqu...@gmail.com> wrote:
>
> On Thu, Jan 26, 2017 at 4:09 PM, Nikhil Sontakke
> <nikh...@2ndquadrant.com> wrote:
>>> I look at this patch from you and that's present for me:
>>> https://www.postgresql.org/message-id/CAMGcDxf8Bn9ZPBBJZba9wiyQq->Qk5uqq=vjomnrnw5s+fks...@mail.gmail.com
>>
>>> --- a/src/backend/access/transam/xlog.c
>>> +++ b/src/backend/access/transam/xlog.c
>>> @@ -9573,6 +9573,7 @@ xlog_redo(XLogReaderState *record)
>>>                     (errmsg("unexpected timeline ID %u (should be %u)
>>> in checkpoint record",
>>>                             checkPoint.ThisTimeLineID, ThisTimeLineID)));
>>>
>>> +           KnownPreparedRecreateFiles(checkPoint.redo);
>>>             RecoveryRestartPoint(&checkPoint);
>>>         }
>>
>> Oh, sorry. I was asking about CheckpointTwoPhase(). I don't see a
>> function by this name. And now I see, the name is CheckPointTwoPhase()
>> :-)
>
> My mistake then :D
>
>>> And actually, when a XLOG_CHECKPOINT_SHUTDOWN record is taken, 2PC
>>> files are not flushed to disk with this patch. This is a problem as a
>>> new restart point is created... Having the flush in CheckpointTwoPhase
>>> really makes the most sense.
>>
>> Umm, AFAICS, CheckPointTwoPhase() does not get called in the "standby
>> promote" code path.
>
> CreateRestartPoint() calls it via CheckPointGuts() while in recovery.
>

Huh, glad that this thread received a lot of attention.

> On 24 Jan 2017, at 17:26, Nikhil Sontakke <nikh...@2ndquadrant.com> wrote:
>
> We are talking about the recovery/promote code path. Specifically this
> call to KnownPreparedRecreateFiles() in PrescanPreparedTransactions().
>
> We write the files to disk and they get immediately read up in the
> following code. We could not write the files to disk and read
> KnownPreparedList in the code path that follows as well as elsewhere.

Thanks Nikhil, now I get it. Since we are talking about promotion, we
are on a different timescale, and a 1-10 second lag matters a lot.

I think I have a realistic scenario in mind where the proposed recovery
code path hits the worst case: Google Cloud. They have quite fast
storage, but fsync time is really high and can go up to 10-100 ms (I
suppose the storage is network-attached). With, say, 300 prepared
transactions, that is 3-30 seconds spent in fsync alone, so promotion
can be delayed by up to half a minute.

So I think it is worth examining.

--
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
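
To put numbers on the worst-case estimate above, here is a minimal
sketch. It assumes one sequential fsync per prepared transaction and
nothing else on the promote path; promote_delay_ms() is a hypothetical
helper for illustration, not something in the patch.

```c
#include <stdint.h>

/*
 * Worst-case extra promotion delay, in milliseconds, assuming each
 * prepared transaction's state file is written and fsync'ed one after
 * another.  Hypothetical helper, not part of the patch.
 */
uint64_t
promote_delay_ms(uint64_t n_prepared, uint64_t fsync_ms)
{
    return n_prepared * fsync_ms;
}
```

With 300 prepared transactions this gives 3000 ms at 10 ms per fsync
and 30000 ms at 100 ms per fsync, i.e. the half-minute upper bound.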