I was trying to understand the v1 patch and found that at the end
RequestCheckpoint() is called unconditionally, I think that should
have been called if REDO had performed, here is the snip from the v1
patch:
/*
- * If this was a promotion, request an (online) checkpoint now. This isn't
- * required for consistency, but the last restartpoint might be far back,
- * and in case of a crash, recovering from it might take a longer than is
- * appropriate now that we're not in standby mode anymore.
+ * Request an (online) checkpoint now. Note that, until this is complete,
+ * a crash would start replay from the same WAL location we did, or from
+ * the last restartpoint that completed. We don't want to let that
+ * situation persist for longer than necessary, since users don't like
+ * long recovery times. On the other hand, they also want to be able to
+ * start doing useful work again as quickly as possible. Therfore, we
+ * don't pass CHECKPOINT_IMMEDIATE to avoid bogging down the system.
+ *
+ * Note that the consequence of requesting a checkpoint here only after
+ * we've allowed WAL writes is that a single checkpoint cycle can span
+ * multiple server lifetimes. So for example if you want to something to
+ * happen at least once per checkpoint cycle or at most once per
+ * checkpoint cycle, you have to consider what happens if the server
+ * is restarted someplace in the middle.
*/
- if (promoted)
- RequestCheckpoint(CHECKPOINT_FORCE);
+ RequestCheckpoint(CHECKPOINT_FORCE);
When I try to call that conditionally like attached, I don't see any
regression failure, correct me if I am missing something here.
Regards,
Amul
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index eddb13d13a7..6d48a1047c6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6549,7 +6549,7 @@ StartupXLOG(void)
DBState dbstate_at_startup;
XLogReaderState *xlogreader;
XLogPageReadPrivate private;
- bool promoted = false;
+ bool written_end_of_recovery = false;
struct stat st;
/*
@@ -7955,45 +7955,9 @@ StartupXLOG(void)
if (InRecovery)
{
- /*
- * Perform a checkpoint to update all our recovery activity to disk.
- *
- * Note that we write a shutdown checkpoint rather than an on-line
- * one. This is not particularly critical, but since we may be
- * assigning a new TLI, using a shutdown checkpoint allows us to have
- * the rule that TLI only changes in shutdown checkpoints, which
- * allows some extra error checking in xlog_redo.
- *
- * In promotion, only create a lightweight end-of-recovery record
- * instead of a full checkpoint. A checkpoint is requested later,
- * after we're fully out of recovery mode and already accepting
- * queries.
- */
- if (ArchiveRecoveryRequested && IsUnderPostmaster &&
- LocalPromoteIsTriggered)
- {
- promoted = true;
-
- /*
- * Insert a special WAL record to mark the end of recovery, since
- * we aren't doing a checkpoint. That means that the checkpointer
- * process may likely be in the middle of a time-smoothed
- * restartpoint and could continue to be for minutes after this.
- * That sounds strange, but the effect is roughly the same and it
- * would be stranger to try to come out of the restartpoint and
- * then checkpoint. We request a checkpoint later anyway, just for
- * safety.
- */
- CreateEndOfRecoveryRecord();
- }
- else
- {
- RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
- CHECKPOINT_IMMEDIATE |
- CHECKPOINT_WAIT);
- }
+ CreateEndOfRecoveryRecord();
+ written_end_of_recovery = true;
}
-
if (ArchiveRecoveryRequested)
{
/*
@@ -8177,12 +8141,9 @@ StartupXLOG(void)
WalSndWakeup();
/*
- * If this was a promotion, request an (online) checkpoint now. This isn't
- * required for consistency, but the last restartpoint might be far back,
- * and in case of a crash, recovering from it might take a longer than is
- * appropriate now that we're not in standby mode anymore.
+ * TODO
*/
- if (promoted)
+ if (written_end_of_recovery)
RequestCheckpoint(CHECKPOINT_FORCE);
}