On Fri, Sep 24, 2021 at 5:07 PM Amul Sul <sula...@gmail.com> wrote:
>
> On Thu, Sep 23, 2021 at 11:56 PM Robert Haas <robertmh...@gmail.com> wrote:
> >
> > On Mon, Sep 20, 2021 at 11:20 AM Amul Sul <sula...@gmail.com> wrote:
> > > Ok, understood, I have separated my changes into 0001 and 0002 patch,
> > > and the refactoring patches start from 0003.
> >
> > I think it would be better in the other order, with the refactoring
> > patches at the beginning of the series.
> >
>
> Ok, will do that. I did this other way to minimize the diff e.g.
> deletion diff of RecoveryXlogAction enum and
> DetermineRecoveryXlogAction(), etc.
>
I have reversed the patch order. The refactoring patches now come
first, and the patch that removes the dependencies on global and local
variables comes last. I made the necessary modifications to the
refactoring patches too, e.g. removed DetermineRecoveryXlogAction() and
the RecoveryXlogAction enum, which are no longer needed (thanks to
commit 1d919de5eb3fffa7cc9479ed6d2915fb89794459, which made the code
simpler).

To find the value of InRecovery after we clear it, the patch still uses
ControlFile's DBState, but the check condition is now more specific,
which makes it less confusing.

In casual off-list discussion, it was suggested to use
SharedRecoveryState to find out the InRecovery value afterward, and to
check it via RecoveryInProgress(). But we can't depend on
SharedRecoveryState, because at startup it is initialized to
RECOVERY_STATE_CRASH irrespective of the InRecovery value that is
determined later. Therefore we can't use RecoveryInProgress(), which
returns true whenever SharedRecoveryState != RECOVERY_STATE_DONE.

I am posting only refactoring patches for now.

Regards,
Amul
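To make the RecoveryInProgress() argument concrete, here is a toy model of it. This is only a sketch and not PostgreSQL source: every Toy*/toy_* name is made up for illustration, and the real shared state has many more fields. It assumes only what is stated above, namely that the shared state starts out as "crash" and that the check is "state != done".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the recovery state enum; not the real one. */
typedef enum ToyRecoveryState
{
    TOY_RECOVERY_STATE_CRASH,
    TOY_RECOVERY_STATE_ARCHIVE,
    TOY_RECOVERY_STATE_DONE
} ToyRecoveryState;

/* Hypothetical stand-in for the shared control struct. */
typedef struct ToyRecoveryCtl
{
    ToyRecoveryState SharedRecoveryState;
} ToyRecoveryCtl;

/* Shared memory setup: the state is CRASH before anyone knows better. */
void
toy_shmem_init(ToyRecoveryCtl *ctl)
{
    assert(ctl != NULL);
    ctl->SharedRecoveryState = TOY_RECOVERY_STATE_CRASH;
}

/* Returns true until the shared state reaches DONE. */
bool
toy_recovery_in_progress(const ToyRecoveryCtl *ctl)
{
    assert(ctl != NULL);
    return ctl->SharedRecoveryState != TOY_RECOVERY_STATE_DONE;
}

/*
 * After a clean shutdown the startup process would decide that no redo is
 * needed (in_recovery = false), yet the shared-state check still says
 * "in progress". Returns true when the two answers disagree.
 */
bool
toy_mismatch_after_clean_start(void)
{
    ToyRecoveryCtl ctl;
    bool        in_recovery = false;    /* assumed: clean shutdown */

    toy_shmem_init(&ctl);
    return toy_recovery_in_progress(&ctl) != in_recovery;
}
```

In this model the shared-state check cannot distinguish "redo was performed" from "redo was never needed", which is why the patch looks at ControlFile's DBState instead.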
From 730e8331fefc882b4cab7112adf0f4d8da1ea831 Mon Sep 17 00:00:00 2001
From: Amul Sul <amul.sul@enterprisedb.com>
Date: Thu, 30 Sep 2021 06:29:06 -0400
Subject: [PATCH v36 4/4] Remove dependencies on startup-process specific variables.

To make XLogAcceptWrites() independent of the startup process, we need
to remove its dependency on a few global and local variables specific
to that process.

The global variables are ArchiveRecoveryRequested and
LocalPromoteIsTriggered. LocalPromoteIsTriggered can be accessed in any
other process using the existing PromoteIsTriggered().
ArchiveRecoveryRequested is made accessible by copying it into shared
memory.

XLogAcceptWrites() accepts two arguments, EndOfLogTLI and EndOfLog,
which are local to StartupXLOG(). Instead of passing them as arguments,
XLogCtl->replayEndTLI and XLogCtl->lastSegSwitchLSN from shared memory
can be used as replacements for EndOfLogTLI and EndOfLog respectively.

XLogCtl->lastSegSwitchLSN is not going to change before we use it. It
changes only when the current WAL segment fills up, which is never
going to happen, for two reasons. First, WAL writes are disabled for
other processes until XLogAcceptWrites() finishes. Second, before
lastSegSwitchLSN is used, XLogAcceptWrites() writes only fixed-size WAL
records (the full-page-writes record and a record for either end of
recovery or a checkpoint), which are not going to fill up the 16MB WAL
segment.

EndOfLogTLI in StartupXLOG() is the timeline ID of the last record that
xlogreader reads, but this xlogreader was simply re-fetching the last
record that we replayed in the redo loop if the server was in recovery.
If it was not in recovery, we don't need to worry, since this value is
needed only when ArchiveRecoveryRequested = true, which implicitly
forces redo and sets the XLogCtl->replayEndTLI value.
--- src/backend/access/transam/xlog.c | 36 ++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 10 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 91cdd7d9ff2..5b4e5ac379f 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -659,6 +659,13 @@ typedef struct XLogCtlData */ bool SharedPromoteIsTriggered; + /* + * SharedArchiveRecoveryRequested exports the value of the + * ArchiveRecoveryRequested flag to be share which is otherwise valid only + * in the startup process. + */ + bool SharedArchiveRecoveryRequested; + /* * WalWriterSleeping indicates whether the WAL writer is currently in * low-power mode (and hence should be nudged if an async commit occurs). @@ -880,8 +887,7 @@ static MemoryContext walDebugCxt = NULL; static void readRecoverySignalFile(void); static void validateRecoveryParameters(void); static void exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog); -static void CleanupAfterArchiveRecovery(TimeLineID EndOfLogTLI, - XLogRecPtr EndOfLog); +static void CleanupAfterArchiveRecovery(void); static bool recoveryStopsBefore(XLogReaderState *record); static bool recoveryStopsAfter(XLogReaderState *record); static char *getRecoveryStopReason(void); @@ -927,7 +933,7 @@ static void UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force); static XLogRecord *ReadRecord(XLogReaderState *xlogreader, int emode, bool fetching_ckpt); static void CheckRecoveryConsistency(void); -static bool XLogAcceptWrites(TimeLineID EndOfLogTLI, XLogRecPtr EndOfLog); +static bool XLogAcceptWrites(void); static bool PerformRecoveryXLogAction(void); static XLogRecord *ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int whichChkpt, bool report); @@ -5230,6 +5236,7 @@ XLOGShmemInit(void) XLogCtl->SharedHotStandbyActive = false; XLogCtl->InstallXLogFileSegmentActive = false; XLogCtl->SharedPromoteIsTriggered = false; + XLogCtl->SharedArchiveRecoveryRequested = 
false; XLogCtl->WalWriterSleeping = false; SpinLockInit(&XLogCtl->Insert.insertpos_lck); @@ -5511,6 +5518,11 @@ readRecoverySignalFile(void) ereport(FATAL, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("standby mode is not supported by single-user servers"))); + + /* + * Remember archive recovery request in shared memory state. + */ + XLogCtl->SharedArchiveRecoveryRequested = ArchiveRecoveryRequested; } static void @@ -5702,8 +5714,10 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog) * Perform cleanup actions at the conclusion of archive recovery. */ static void -CleanupAfterArchiveRecovery(TimeLineID EndOfLogTLI, XLogRecPtr EndOfLog) +CleanupAfterArchiveRecovery(void) { + XLogRecPtr EndOfLog; + /* * Execute the recovery_end_command, if any. */ @@ -5720,6 +5734,7 @@ CleanupAfterArchiveRecovery(TimeLineID EndOfLogTLI, XLogRecPtr EndOfLog) * files containing garbage. In any case, they are not part of the new * timeline's history so we don't need them. */ + (void) GetLastSegSwitchData(&EndOfLog); RemoveNonParentXlogFiles(EndOfLog, ThisTimeLineID); /* @@ -5754,6 +5769,7 @@ CleanupAfterArchiveRecovery(TimeLineID EndOfLogTLI, XLogRecPtr EndOfLog) { char origfname[MAXFNAMELEN]; XLogSegNo endLogSegNo; + TimeLineID EndOfLogTLI = XLogCtl->replayEndTLI; XLByteToPrevSeg(EndOfLog, endLogSegNo, wal_segment_size); XLogFileName(origfname, EndOfLogTLI, endLogSegNo, wal_segment_size); @@ -8023,7 +8039,7 @@ StartupXLOG(void) Insert->fullPageWrites = lastFullPageWrites; /* Prepare to accept WAL writes. */ - promoted = XLogAcceptWrites(EndOfLogTLI, EndOfLog); + promoted = XLogAcceptWrites(); /* * If there were cascading standby servers connected to us, nudge any wal @@ -8045,7 +8061,7 @@ StartupXLOG(void) * Prepare to accept WAL writes. 
*/ static bool -XLogAcceptWrites(TimeLineID EndOfLogTLI, XLogRecPtr EndOfLog) +XLogAcceptWrites(void) { bool promoted = false; @@ -8063,8 +8079,8 @@ XLogAcceptWrites(TimeLineID EndOfLogTLI, XLogRecPtr EndOfLog) promoted = PerformRecoveryXLogAction(); /* If this is archive recovery, perform post-recovery cleanup actions. */ - if (ArchiveRecoveryRequested) - CleanupAfterArchiveRecovery(EndOfLogTLI, EndOfLog); + if (XLogCtl->SharedArchiveRecoveryRequested) + CleanupAfterArchiveRecovery(); /* * If any of the critical GUCs have changed, log them before we allow @@ -8232,8 +8248,8 @@ PerformRecoveryXLogAction(void) * a full checkpoint. A checkpoint is requested later, after we're fully out * of recovery mode and already accepting queries. */ - if (ArchiveRecoveryRequested && IsUnderPostmaster && - LocalPromoteIsTriggered) + if (XLogCtl->SharedArchiveRecoveryRequested && IsUnderPostmaster && + PromoteIsTriggered()) { promoted = true; -- 2.18.0
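The idea of the patch above, reduced to a toy: a process-local flag is published into a shared struct once, and the later step reads everything it needs from that struct instead of taking arguments. This is a sketch under stated assumptions, not the patch itself. The Toy*/toy_* names and the ToyCleanupInputs struct are hypothetical; only the three field names are borrowed from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyTimeLineID;
typedef uint64_t ToyXLogRecPtr;

/* Hypothetical stand-in for the shared fields the patch relies on. */
typedef struct ToySharedCtl
{
    bool          SharedArchiveRecoveryRequested;
    ToyTimeLineID replayEndTLI;
    ToyXLogRecPtr lastSegSwitchLSN;
} ToySharedCtl;

/* Values the cleanup step needs; before the patch these were arguments. */
typedef struct ToyCleanupInputs
{
    bool          cleanup_needed;
    ToyTimeLineID end_of_log_tli;
    ToyXLogRecPtr end_of_log;
} ToyCleanupInputs;

/* Startup-process side: publish the process-local flag once it is known. */
void
toy_publish_archive_recovery(ToySharedCtl *ctl, bool archive_recovery_requested)
{
    assert(ctl != NULL);
    ctl->SharedArchiveRecoveryRequested = archive_recovery_requested;
}

/* Reader side: derive all inputs from the shared struct alone. */
ToyCleanupInputs
toy_cleanup_inputs(const ToySharedCtl *ctl)
{
    ToyCleanupInputs in;

    assert(ctl != NULL);
    in.cleanup_needed = ctl->SharedArchiveRecoveryRequested;
    in.end_of_log_tli = ctl->replayEndTLI;
    in.end_of_log = ctl->lastSegSwitchLSN;
    return in;
}
```

The sketch ignores locking entirely; the commit message's argument is that the values cannot change while they are being used, which is what would make an unlocked read acceptable there.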
From 76587e09ce6b7811ff940e2e65051cb49e7c16e6 Mon Sep 17 00:00:00 2001 From: Robert Haas <rhaas@postgresql.org> Date: Fri, 23 Jul 2021 15:37:53 -0400 Subject: [PATCH v36 3/4] Create XLogAcceptWrites() function with code from StartupXLOG(). This is just code movement. A future patch will want to defer the call to XLogAcceptWrites() until a later time, rather than doing it as soon as we finish applying WAL, but here we're just grouping related code together into a new function. Robert Haas, with modifications by Amul Sul. --- src/backend/access/transam/xlog.c | 52 ++++++++++++++++++++++--------- 1 file changed, 37 insertions(+), 15 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 93849d8f29a..91cdd7d9ff2 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -927,6 +927,7 @@ static void UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force); static XLogRecord *ReadRecord(XLogReaderState *xlogreader, int emode, bool fetching_ckpt); static void CheckRecoveryConsistency(void); +static bool XLogAcceptWrites(TimeLineID EndOfLogTLI, XLogRecPtr EndOfLog); static bool PerformRecoveryXLogAction(void); static XLogRecord *ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int whichChkpt, bool report); @@ -8014,7 +8015,41 @@ StartupXLOG(void) * record before resource manager writes cleanup WAL records or checkpoint * record is written. */ - Insert->fullPageWrites = lastFullPageWrites; + /* + * Update full_page_writes in shared memory, and later whenever wal write + * permitted, write an XLOG_FPW_CHANGE record before resource manager + * writes cleanup WAL records or checkpoint record is written. + */ + Insert->fullPageWrites = lastFullPageWrites; + + /* Prepare to accept WAL writes. */ + promoted = XLogAcceptWrites(EndOfLogTLI, EndOfLog); + + /* + * If there were cascading standby servers connected to us, nudge any wal + * sender processes to notice that we've been promoted. 
+ */ + WalSndWakeup(); + + /* + * If this was a promotion, request an (online) checkpoint now. This isn't + * required for consistency, but the last restartpoint might be far back, + * and in case of a crash, recovering from it might take a longer than is + * appropriate now that we're not in standby mode anymore. + */ + if (promoted) + RequestCheckpoint(CHECKPOINT_FORCE); +} + +/* + * Prepare to accept WAL writes. + */ +static bool +XLogAcceptWrites(TimeLineID EndOfLogTLI, XLogRecPtr EndOfLog) +{ + bool promoted = false; + + /* Write an XLOG_FPW_CHANGE record */ LocalSetXLogInsertAllowed(); UpdateFullPageWrites(); LocalXLogInsertAllowed = -1; @@ -8070,20 +8105,7 @@ StartupXLOG(void) UpdateControlFile(); LWLockRelease(ControlFileLock); - /* - * If there were cascading standby servers connected to us, nudge any wal - * sender processes to notice that we've been promoted. - */ - WalSndWakeup(); - - /* - * If this was a promotion, request an (online) checkpoint now. This isn't - * required for consistency, but the last restartpoint might be far back, - * and in case of a crash, recovering from it might take a longer than is - * appropriate now that we're not in standby mode anymore. - */ - if (promoted) - RequestCheckpoint(CHECKPOINT_FORCE); + return promoted; } /* -- 2.18.0
From 1a14516bfca72febbc3e70f7d25398c0f074c3d8 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Jul 2021 13:07:56 -0400
Subject: [PATCH v36 1/4] Refactor some end-of-recovery code out of StartupXLOG().

Move the code that decides whether to write a checkpoint or an
end-of-recovery record into PerformRecoveryXLogAction().

Also create a new function CleanupAfterArchiveRecovery() to perform a
few tasks that we want to do after we've actually exited archive
recovery but before we start accepting new WAL writes.

This is straightforward code movement to make StartupXLOG() a little
bit shorter and a little bit easier to understand.

Robert Haas, with modifications by Amul Sul.
--- src/backend/access/transam/xlog.c | 261 ++++++++++++++++-------------- 1 file changed, 143 insertions(+), 118 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index e51a7a749da..397f7d486a6 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -880,6 +880,8 @@ static MemoryContext walDebugCxt = NULL; static void readRecoverySignalFile(void); static void validateRecoveryParameters(void); static void exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog); +static void CleanupAfterArchiveRecovery(TimeLineID EndOfLogTLI, + XLogRecPtr EndOfLog); static bool recoveryStopsBefore(XLogReaderState *record); static bool recoveryStopsAfter(XLogReaderState *record); static char *getRecoveryStopReason(void); @@ -925,6 +927,7 @@ static void UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force); static XLogRecord *ReadRecord(XLogReaderState *xlogreader, int emode, bool fetching_ckpt); static void CheckRecoveryConsistency(void); +static bool PerformRecoveryXLogAction(void); static XLogRecord *ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int whichChkpt, bool report); static bool rescanLatestTimeLine(void); @@ -5694,6 +5697,88 @@ exitArchiveRecovery(TimeLineID endTLI, 
XLogRecPtr endOfLog) (errmsg("archive recovery complete"))); } +/* + * Perform cleanup actions at the conclusion of archive recovery. + */ +static void +CleanupAfterArchiveRecovery(TimeLineID EndOfLogTLI, XLogRecPtr EndOfLog) +{ + /* + * Execute the recovery_end_command, if any. + */ + if (recoveryEndCommand && strcmp(recoveryEndCommand, "") != 0) + ExecuteRecoveryCommand(recoveryEndCommand, + "recovery_end_command", + true); + + /* + * We switched to a new timeline. Clean up segments on the old timeline. + * + * If there are any higher-numbered segments on the old timeline, remove + * them. They might contain valid WAL, but they might also be pre-allocated + * files containing garbage. In any case, they are not part of the new + * timeline's history so we don't need them. + */ + RemoveNonParentXlogFiles(EndOfLog, ThisTimeLineID); + + /* + * If the switch happened in the middle of a segment, what to do with the + * last, partial segment on the old timeline? If we don't archive it, and + * the server that created the WAL never archives it either (e.g. because it + * was hit by a meteor), it will never make it to the archive. That's OK + * from our point of view, because the new segment that we created with the + * new TLI contains all the WAL from the old timeline up to the switch + * point. But if you later try to do PITR to the "missing" WAL on the old + * timeline, recovery won't find it in the archive. It's physically present + * in the new file with new TLI, but recovery won't look there when it's + * recovering to the older timeline. On the other hand, if we archive the + * partial segment, and the original server on that timeline is still + * running and archives the completed version of the same segment later, it + * will fail. (We used to do that in 9.4 and below, and it caused such + * problems). + * + * As a compromise, we rename the last segment with the .partial suffix, and + * archive it. 
Archive recovery will never try to read .partial segments, so + * they will normally go unused. But in the odd PITR case, the administrator + * can copy them manually to the pg_wal directory (removing the suffix). + * They can be useful in debugging, too. + * + * If a .done or .ready file already exists for the old timeline, however, + * we had already determined that the segment is complete, so we can let it + * be archived normally. (In particular, if it was restored from the archive + * to begin with, it's expected to have a .done file). + */ + if (XLogSegmentOffset(EndOfLog, wal_segment_size) != 0 && + XLogArchivingActive()) + { + char origfname[MAXFNAMELEN]; + XLogSegNo endLogSegNo; + + XLByteToPrevSeg(EndOfLog, endLogSegNo, wal_segment_size); + XLogFileName(origfname, EndOfLogTLI, endLogSegNo, wal_segment_size); + + if (!XLogArchiveIsReadyOrDone(origfname)) + { + char origpath[MAXPGPATH]; + char partialfname[MAXFNAMELEN]; + char partialpath[MAXPGPATH]; + + XLogFilePath(origpath, EndOfLogTLI, endLogSegNo, wal_segment_size); + snprintf(partialfname, MAXFNAMELEN, "%s.partial", origfname); + snprintf(partialpath, MAXPGPATH, "%s.partial", origpath); + + /* + * Make sure there's no .done or .ready file for the .partial + * file. + */ + XLogArchiveCleanup(partialfname); + + durable_rename(origpath, partialpath, ERROR); + XLogArchiveNotify(partialfname); + } + } +} + /* * Extract timestamp from WAL record. * @@ -7883,127 +7968,13 @@ StartupXLOG(void) UpdateFullPageWrites(); LocalXLogInsertAllowed = -1; + /* Emit checkpoint or end-of-recovery record in XLOG, if required. */ if (InRecovery) - { - /* - * Perform a checkpoint to update all our recovery activity to disk. - * - * Note that we write a shutdown checkpoint rather than an on-line - * one. 
This is not particularly critical, but since we may be - * assigning a new TLI, using a shutdown checkpoint allows us to have - * the rule that TLI only changes in shutdown checkpoints, which - * allows some extra error checking in xlog_redo. - * - * In promotion, only create a lightweight end-of-recovery record - * instead of a full checkpoint. A checkpoint is requested later, - * after we're fully out of recovery mode and already accepting - * queries. - */ - if (ArchiveRecoveryRequested && IsUnderPostmaster && - LocalPromoteIsTriggered) - { - promoted = true; - - /* - * Insert a special WAL record to mark the end of recovery, since - * we aren't doing a checkpoint. That means that the checkpointer - * process may likely be in the middle of a time-smoothed - * restartpoint and could continue to be for minutes after this. - * That sounds strange, but the effect is roughly the same and it - * would be stranger to try to come out of the restartpoint and - * then checkpoint. We request a checkpoint later anyway, just for - * safety. - */ - CreateEndOfRecoveryRecord(); - } - else - { - RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY | - CHECKPOINT_IMMEDIATE | - CHECKPOINT_WAIT); - } - } + promoted = PerformRecoveryXLogAction(); + /* If this is archive recovery, perform post-recovery cleanup actions. */ if (ArchiveRecoveryRequested) - { - /* - * And finally, execute the recovery_end_command, if any. - */ - if (recoveryEndCommand && strcmp(recoveryEndCommand, "") != 0) - ExecuteRecoveryCommand(recoveryEndCommand, - "recovery_end_command", - true); - - /* - * We switched to a new timeline. Clean up segments on the old - * timeline. - * - * If there are any higher-numbered segments on the old timeline, - * remove them. They might contain valid WAL, but they might also be - * pre-allocated files containing garbage. In any case, they are not - * part of the new timeline's history so we don't need them. 
- */ - RemoveNonParentXlogFiles(EndOfLog, ThisTimeLineID); - - /* - * If the switch happened in the middle of a segment, what to do with - * the last, partial segment on the old timeline? If we don't archive - * it, and the server that created the WAL never archives it either - * (e.g. because it was hit by a meteor), it will never make it to the - * archive. That's OK from our point of view, because the new segment - * that we created with the new TLI contains all the WAL from the old - * timeline up to the switch point. But if you later try to do PITR to - * the "missing" WAL on the old timeline, recovery won't find it in - * the archive. It's physically present in the new file with new TLI, - * but recovery won't look there when it's recovering to the older - * timeline. On the other hand, if we archive the partial segment, and - * the original server on that timeline is still running and archives - * the completed version of the same segment later, it will fail. (We - * used to do that in 9.4 and below, and it caused such problems). - * - * As a compromise, we rename the last segment with the .partial - * suffix, and archive it. Archive recovery will never try to read - * .partial segments, so they will normally go unused. But in the odd - * PITR case, the administrator can copy them manually to the pg_wal - * directory (removing the suffix). They can be useful in debugging, - * too. - * - * If a .done or .ready file already exists for the old timeline, - * however, we had already determined that the segment is complete, so - * we can let it be archived normally. (In particular, if it was - * restored from the archive to begin with, it's expected to have a - * .done file). 
- */ - if (XLogSegmentOffset(EndOfLog, wal_segment_size) != 0 && - XLogArchivingActive()) - { - char origfname[MAXFNAMELEN]; - XLogSegNo endLogSegNo; - - XLByteToPrevSeg(EndOfLog, endLogSegNo, wal_segment_size); - XLogFileName(origfname, EndOfLogTLI, endLogSegNo, wal_segment_size); - - if (!XLogArchiveIsReadyOrDone(origfname)) - { - char origpath[MAXPGPATH]; - char partialfname[MAXFNAMELEN]; - char partialpath[MAXPGPATH]; - - XLogFilePath(origpath, EndOfLogTLI, endLogSegNo, wal_segment_size); - snprintf(partialfname, MAXFNAMELEN, "%s.partial", origfname); - snprintf(partialpath, MAXPGPATH, "%s.partial", origpath); - - /* - * Make sure there's no .done or .ready file for the .partial - * file. - */ - XLogArchiveCleanup(partialfname); - - durable_rename(origpath, partialpath, ERROR); - XLogArchiveNotify(partialfname); - } - } - } + CleanupAfterArchiveRecovery(EndOfLogTLI, EndOfLog); /* * Preallocate additional log files, if wanted. @@ -8207,6 +8178,60 @@ CheckRecoveryConsistency(void) } } +/* + * Perform whatever XLOG actions are necessary at end of REDO. + * + * The goal here is to make sure that we'll be able to recover properly if + * we crash again. If we choose to write a checkpoint, we'll write a shutdown + * checkpoint rather than an on-line one. This is not particularly critical, + * but since we may be assigning a new TLI, using a shutdown checkpoint allows + * us to have the rule that TLI only changes in shutdown checkpoints, which + * allows some extra error checking in xlog_redo. + */ +static bool +PerformRecoveryXLogAction(void) +{ + bool promoted = false; + + /* + * Perform a checkpoint to update all our recovery activity to disk. + * + * Note that we write a shutdown checkpoint rather than an on-line one. This + * is not particularly critical, but since we may be assigning a new TLI, + * using a shutdown checkpoint allows us to have the rule that TLI only + * changes in shutdown checkpoints, which allows some extra error checking + * in xlog_redo. 
+ * + * In promotion, only create a lightweight end-of-recovery record instead of + * a full checkpoint. A checkpoint is requested later, after we're fully out + * of recovery mode and already accepting queries. + */ + if (ArchiveRecoveryRequested && IsUnderPostmaster && + LocalPromoteIsTriggered) + { + promoted = true; + + /* + * Insert a special WAL record to mark the end of recovery, since we + * aren't doing a checkpoint. That means that the checkpointer process + * may likely be in the middle of a time-smoothed restartpoint and could + * continue to be for minutes after this. That sounds strange, but the + * effect is roughly the same and it would be stranger to try to come + * out of the restartpoint and then checkpoint. We request a checkpoint + * later anyway, just for safety. + */ + CreateEndOfRecoveryRecord(); + } + else + { + RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY | + CHECKPOINT_IMMEDIATE | + CHECKPOINT_WAIT); + } + + return promoted; +} + /* * Is the system still in recovery? * -- 2.18.0
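The decision that PerformRecoveryXLogAction() makes can be isolated as a pure function, which may help when reading the diff above. This is a sketch only: the enum and the toy_* function are hypothetical, and the three boolean parameters stand in for ArchiveRecoveryRequested, IsUnderPostmaster and LocalPromoteIsTriggered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the two outcomes described in the comments. */
typedef enum ToyEndOfRedoAction
{
    TOY_WRITE_END_OF_RECOVERY_RECORD,
    TOY_REQUEST_END_OF_RECOVERY_CHECKPOINT
} ToyEndOfRedoAction;

/*
 * On promotion (archive recovery, running under postmaster, promote
 * triggered) only the lightweight end-of-recovery record is written and
 * *promoted is set; otherwise an end-of-recovery checkpoint is requested.
 */
ToyEndOfRedoAction
toy_choose_end_of_redo_action(bool archive_recovery_requested,
                              bool is_under_postmaster,
                              bool promote_is_triggered,
                              bool *promoted)
{
    assert(promoted != NULL);

    if (archive_recovery_requested && is_under_postmaster &&
        promote_is_triggered)
    {
        *promoted = true;
        return TOY_WRITE_END_OF_RECOVERY_RECORD;
    }

    *promoted = false;
    return TOY_REQUEST_END_OF_RECOVERY_CHECKPOINT;
}
```

The caller in the patch uses the returned promoted flag later to request an online checkpoint once the server is out of recovery.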
From bad3d0db320f68b083311578b1b17ff8cd1714c6 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 23 Jul 2021 14:27:51 -0400
Subject: [PATCH v36 2/4] Postpone some end-of-recovery operations relating to allowing WAL.

A previous commit moved the code that decides whether to write a
checkpoint or an end-of-recovery record into
PerformRecoveryXLogAction(), and the code that performs
post-archive-recovery cleanup into CleanupAfterArchiveRecovery(), but
called both functions from the same place. Now postpone that stuff
until after we clear InRecovery and shut down the XLogReader.

The InRecovery value is needed to decide whether to call
PerformRecoveryXLogAction(); after it has been cleared, we can still
find it out by looking at ControlFile's DBState.

This is preparatory work for a future patch that wants to allow
recovery to end at one time and only later start to allow WAL writes.
The steps that themselves write WAL clearly shouldn't happen before
we're ready to accept WAL writes, and it seems best for now to keep the
steps performed by CleanupAfterArchiveRecovery() at the same point
relative to the surrounding steps. We assume (hopefully correctly) that
the user doesn't want recovery_end_command to run until we're committed
to writing WAL on the new timeline. Until then, the machine is still
usable as a standby on the old timeline.

Aside from the value of this patch as preparatory work, this order of
operations actually seems more logical, since it means we don't
actually write any WAL until after exiting recovery.

Robert Haas, with modifications by Amul Sul.
--- src/backend/access/transam/xlog.c | 40 +++++++++++++++++-------------- 1 file changed, 22 insertions(+), 18 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 397f7d486a6..93849d8f29a 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7958,24 +7958,6 @@ StartupXLOG(void) XLogCtl->LogwrtRqst.Write = EndOfLog; XLogCtl->LogwrtRqst.Flush = EndOfLog; - /* - * Update full_page_writes in shared memory and write an XLOG_FPW_CHANGE - * record before resource manager writes cleanup WAL records or checkpoint - * record is written. - */ - Insert->fullPageWrites = lastFullPageWrites; - LocalSetXLogInsertAllowed(); - UpdateFullPageWrites(); - LocalXLogInsertAllowed = -1; - - /* Emit checkpoint or end-of-recovery record in XLOG, if required. */ - if (InRecovery) - promoted = PerformRecoveryXLogAction(); - - /* If this is archive recovery, perform post-recovery cleanup actions. */ - if (ArchiveRecoveryRequested) - CleanupAfterArchiveRecovery(EndOfLogTLI, EndOfLog); - /* * Preallocate additional log files, if wanted. */ @@ -8027,6 +8009,28 @@ StartupXLOG(void) } XLogReaderFree(xlogreader); + /* + * Update full_page_writes in shared memory and write an XLOG_FPW_CHANGE + * record before resource manager writes cleanup WAL records or checkpoint + * record is written. + */ + Insert->fullPageWrites = lastFullPageWrites; + LocalSetXLogInsertAllowed(); + UpdateFullPageWrites(); + LocalXLogInsertAllowed = -1; + + /* + * Emit checkpoint or end-of-recovery record in XLOG, if the server has been + * through the archive or the crash recovery. + */ + if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY || + ControlFile->state == DB_IN_CRASH_RECOVERY) + promoted = PerformRecoveryXLogAction(); + + /* If this is archive recovery, perform post-recovery cleanup actions. 
*/ + if (ArchiveRecoveryRequested) + CleanupAfterArchiveRecovery(EndOfLogTLI, EndOfLog); + /* * If any of the critical GUCs have changed, log them before we allow * backends to write WAL. -- 2.18.0
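The ControlFile-based check in the patch above can be read as "was the server in either kind of recovery?". A minimal sketch of that predicate follows. It is a toy: the ToyDBState enum lists only a few illustrative states and ToyControlFile has a single field, so neither matches the real control file layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, shortened stand-in for the control file's state enum. */
typedef enum ToyDBState
{
    TOY_DB_SHUTDOWNED,
    TOY_DB_IN_CRASH_RECOVERY,
    TOY_DB_IN_ARCHIVE_RECOVERY,
    TOY_DB_IN_PRODUCTION
} ToyDBState;

/* Hypothetical stand-in for the control file struct. */
typedef struct ToyControlFile
{
    ToyDBState  state;
} ToyControlFile;

/*
 * Replacement for testing the (already cleared) InRecovery flag: the
 * end-of-recovery XLOG action is needed only if the recorded state says
 * the server went through crash or archive recovery.
 */
bool
toy_needs_recovery_xlog_action(const ToyControlFile *cf)
{
    assert(cf != NULL);
    return cf->state == TOY_DB_IN_ARCHIVE_RECOVERY ||
        cf->state == TOY_DB_IN_CRASH_RECOVERY;
}
```

This relies on the recorded state still reflecting recovery at the point of the check, i.e. on the state not yet having been switched to "in production" when the predicate is evaluated.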