On 2020-Jan-10, Alvaro Herrera wrote: > A customer of ours complained that if you have an inactive primary, > monitoring the apply lag on a standby reports monotonically increasing > lag. The reason for this is that the apply lag is only updated on > COMMIT records, which of course don't occur in inactive servers. > But CHECKPOINT records do occur, so the WAL insert pointer continues to > move forward, which is what causes the spurious lag. > > (I think newer releases are protected from this problem because they > don't emit checkpoints during periods of inactivity. I didn't verify > this.) > > This patch fixes the problem by using the checkpoint timestamp to update > the lag tracker in the standby. This requires a little change in where > this update is invoked, because previously it was done only for the XACT > rmgr; this makes the patch a little bigger than it should.
Here's a version of the patch that applies to current master. It does fix the problem that CHECKPOINT wal records are not considered when determining time-of-latest-record. However, it does *not* fix the monitoring problem I mentioned (which relied on comparing pg_last_xact_replay_timestamp() to 'now()') ... because commit 6ef2eba3f57f (pg10) made an idle server not emit checkpoint records anymore. That is, my parenthical remark was completely wrong: the new versions not only are "protected", but also this fix doesn't fix them. Luckily, the way to fix monitoring for servers of versions 10 and later is to use the new replay_lag (etc) columns in pg_stat_replication, commit 6912acc04f0b (also pg10). I am inclined to apply this to all branches unless there are strong objections, because the current code seems pretty arbitrary anyway. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 35216a217af6c3452e9cd70b10963c238bcee9ff Mon Sep 17 00:00:00 2001 From: Alvaro Herrera <alvhe...@alvh.no-ip.org> Date: Mon, 27 Jan 2020 14:43:17 -0300 Subject: [PATCH v2] Use CheckPoint->time to update latest recovery timestamp MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Previously, only timestamps carried by transaction commit/abort and recovery labels were used, but there's no reason to ignore checkpoint's. Also, rearrange things so that the timestamp is updated by StartupXLOG's loop rather than recoveryStopsAfter, which was an odd choice. Author: Álvaro Herrera Reviewed-by: Discussion: https://postgr.es/m/20200110140828.GA6228@alvherre.pgsql --- src/backend/access/transam/xlog.c | 37 ++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 11 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 882d5e8a73..0bc3d8cec5 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -5560,8 +5560,8 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog) * * If the record contains a timestamp, returns true, and saves the timestamp * in *recordXtime. If the record type has no timestamp, returns false. - * Currently, only transaction commit/abort records and restore points contain - * timestamps. + * Currently, only transaction commit/abort records, restore points and + * checkpoints contain timestamps. */ static bool getRecordTimestamp(XLogReaderState *record, TimestampTz *recordXtime) @@ -5575,6 +5575,13 @@ getRecordTimestamp(XLogReaderState *record, TimestampTz *recordXtime) *recordXtime = ((xl_restore_point *) XLogRecGetData(record))->rp_time; return true; } + if (rmid == RM_XLOG_ID && (info == XLOG_CHECKPOINT_ONLINE || + info == XLOG_CHECKPOINT_SHUTDOWN)) + { + *recordXtime = + time_t_to_timestamptz(((CheckPoint *) XLogRecGetData(record))->time); + return true; + } if (rmid == RM_XACT_ID && (xact_info == XLOG_XACT_COMMIT || xact_info == XLOG_XACT_COMMIT_PREPARED)) { @@ -5743,9 +5750,6 @@ recoveryStopsBefore(XLogReaderState *record) /* * Same as recoveryStopsBefore, but called after applying the record. - * - * We also track the timestamp of the latest applied COMMIT/ABORT - * record in XLogCtl->recoveryLastXTime. */ static bool recoveryStopsAfter(XLogReaderState *record) @@ -5753,7 +5757,6 @@ recoveryStopsAfter(XLogReaderState *record) uint8 info; uint8 xact_info; uint8 rmid; - TimestampTz recordXtime; /* * Ignore recovery target settings when not in archive recovery (meaning @@ -5821,10 +5824,6 @@ recoveryStopsAfter(XLogReaderState *record) { TransactionId recordXid; - /* Update the last applied transaction timestamp */ - if (getRecordTimestamp(record, &recordXtime)) - SetLatestXTime(recordXtime); - /* Extract the XID of the committed/aborted transaction */ if (xact_info == XLOG_XACT_COMMIT_PREPARED) { @@ -5863,7 +5862,7 @@ recoveryStopsAfter(XLogReaderState *record) { recoveryStopAfter = true; recoveryStopXid = recordXid; - recoveryStopTime = recordXtime; + getRecordTimestamp(record, &recoveryStopTime); recoveryStopLSN = InvalidXLogRecPtr; recoveryStopName[0] = '\0'; @@ -6047,6 +6046,19 @@ recoveryApplyDelay(XLogReaderState *record) return true; } +/* + * Track the timestamp of the latest applied time-bearing WAL record in + * XLogCtl->recoveryLastXTime. + */ +static void +updateRecoveryXTime(XLogReaderState *record) +{ + TimestampTz recordXtime; + + if (getRecordTimestamp(record, &recordXtime)) + SetLatestXTime(recordXtime); +} + /* * Save timestamp of latest processed commit/abort record. * @@ -7255,6 +7267,9 @@ StartupXLOG(void) WalSndWakeup(); } + /* update last recovery time */ + updateRecoveryXTime(xlogreader); + /* Exit loop if we reached inclusive recovery target */ if (recoveryStopsAfter(xlogreader)) { -- 2.20.1