On 2020-Jan-10, Alvaro Herrera wrote:

> A customer of ours complained that if you have an inactive primary,
> monitoring the apply lag on a standby reports monotonically increasing
> lag.  The reason for this is that the apply lag is only updated on
> COMMIT records, which of course don't occur in inactive servers.
> But CHECKPOINT records do occur, so the WAL insert pointer continues to
> move forward, which is what causes the spurious lag.
> 
> (I think newer releases are protected from this problem because they
> don't emit checkpoints during periods of inactivity.  I didn't verify
> this.)
> 
> This patch fixes the problem by using the checkpoint timestamp to update
> the lag tracker in the standby.  This requires a little change in where
> this update is invoked, because previously it was done only for the XACT
> rmgr; this makes the patch a little bigger than it should.

Here's a version of the patch that applies to current master.  It does
fix the problem that CHECKPOINT wal records are not considered when
determining time-of-latest-record.

However, it does *not* fix the monitoring problem I mentioned (which
relied on comparing pg_last_xact_replay_timestamp() to 'now()') ...
because commit 6ef2eba3f57f (pg10) made an idle server not emit
checkpoint records anymore.  That is, my parenthical remark was
completely wrong: the new versions not only are "protected", but also
this fix doesn't fix them.  Luckily, the way to fix monitoring for
servers of versions 10 and later is to use the new replay_lag (etc)
columns in pg_stat_replication, commit 6912acc04f0b (also pg10).

I am inclined to apply this to all branches unless there are strong
objections, because the current code seems pretty arbitrary anyway.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 35216a217af6c3452e9cd70b10963c238bcee9ff Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Mon, 27 Jan 2020 14:43:17 -0300
Subject: [PATCH v2] Use CheckPoint->time to update latest recovery timestamp
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously, only timestamps carried by transaction commit/abort and
recovery labels were used, but there's no reason to ignore checkpoint's.

Also, rearrange things so that the timestamp is updated by StartupXLOG's
loop rather than recoveryStopsAfter, which was an odd choice.

Author: Álvaro Herrera
Reviewed-by:
Discussion: https://postgr.es/m/20200110140828.GA6228@alvherre.pgsql
---
 src/backend/access/transam/xlog.c | 37 ++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 882d5e8a73..0bc3d8cec5 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5560,8 +5560,8 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog)
  *
  * If the record contains a timestamp, returns true, and saves the timestamp
  * in *recordXtime. If the record type has no timestamp, returns false.
- * Currently, only transaction commit/abort records and restore points contain
- * timestamps.
+ * Currently, only transaction commit/abort records, restore points and
+ * checkpoints contain timestamps.
  */
 static bool
 getRecordTimestamp(XLogReaderState *record, TimestampTz *recordXtime)
@@ -5575,6 +5575,13 @@ getRecordTimestamp(XLogReaderState *record, TimestampTz *recordXtime)
 		*recordXtime = ((xl_restore_point *) XLogRecGetData(record))->rp_time;
 		return true;
 	}
+	if (rmid == RM_XLOG_ID && (info == XLOG_CHECKPOINT_ONLINE ||
+							   info == XLOG_CHECKPOINT_SHUTDOWN))
+	{
+		*recordXtime =
+			time_t_to_timestamptz(((CheckPoint *) XLogRecGetData(record))->time);
+		return true;
+	}
 	if (rmid == RM_XACT_ID && (xact_info == XLOG_XACT_COMMIT ||
 							   xact_info == XLOG_XACT_COMMIT_PREPARED))
 	{
@@ -5743,9 +5750,6 @@ recoveryStopsBefore(XLogReaderState *record)
 
 /*
  * Same as recoveryStopsBefore, but called after applying the record.
- *
- * We also track the timestamp of the latest applied COMMIT/ABORT
- * record in XLogCtl->recoveryLastXTime.
  */
 static bool
 recoveryStopsAfter(XLogReaderState *record)
@@ -5753,7 +5757,6 @@ recoveryStopsAfter(XLogReaderState *record)
 	uint8		info;
 	uint8		xact_info;
 	uint8		rmid;
-	TimestampTz recordXtime;
 
 	/*
 	 * Ignore recovery target settings when not in archive recovery (meaning
@@ -5821,10 +5824,6 @@ recoveryStopsAfter(XLogReaderState *record)
 	{
 		TransactionId recordXid;
 
-		/* Update the last applied transaction timestamp */
-		if (getRecordTimestamp(record, &recordXtime))
-			SetLatestXTime(recordXtime);
-
 		/* Extract the XID of the committed/aborted transaction */
 		if (xact_info == XLOG_XACT_COMMIT_PREPARED)
 		{
@@ -5863,7 +5862,7 @@ recoveryStopsAfter(XLogReaderState *record)
 		{
 			recoveryStopAfter = true;
 			recoveryStopXid = recordXid;
-			recoveryStopTime = recordXtime;
+			getRecordTimestamp(record, &recoveryStopTime);
 			recoveryStopLSN = InvalidXLogRecPtr;
 			recoveryStopName[0] = '\0';
 
@@ -6047,6 +6046,19 @@ recoveryApplyDelay(XLogReaderState *record)
 	return true;
 }
 
+/*
+ * Track the timestamp of the latest applied time-bearing WAL record in
+ * XLogCtl->recoveryLastXTime.
+ */
+static void
+updateRecoveryXTime(XLogReaderState *record)
+{
+	TimestampTz	recordXtime;
+
+	if (getRecordTimestamp(record, &recordXtime))
+		SetLatestXTime(recordXtime);
+}
+
 /*
  * Save timestamp of latest processed commit/abort record.
  *
@@ -7255,6 +7267,9 @@ StartupXLOG(void)
 						WalSndWakeup();
 				}
 
+				/* update last recovery time */
+				updateRecoveryXTime(xlogreader);
+
 				/* Exit loop if we reached inclusive recovery target */
 				if (recoveryStopsAfter(xlogreader))
 				{
-- 
2.20.1

Reply via email to