Hello,
I came across this issue while I was tweaking a TAP test with Ashwin for
this thread [1].
We noticed that while the startup process waits for a recovery delay, it does
not respect changes to the recovery_min_apply_delay GUC. So:
// Standby is waiting on recoveryWakeupLatch with a large
recovery_min_apply_delay value
node_standby->safe_psql('postgres', 'ALTER SYSTEM SET
recovery_min_apply_delay TO 0;');
node_standby->reload;
// Standby is still waiting, it does not proceed until the old timeout
is elapsed.
I attached a small patch to fix this. It makes it so that
recovery_min_apply_delay is reconsulted in the loop to redetermine the wait
interval. This makes it similar to the loop in CheckpointerMain, where
CheckPointTimeout is consulted after interrupts are handled:
if (elapsed_secs >= CheckPointTimeout)
continue; /* no sleep for us ... */
I have also added a test to 005_replay_delay.pl.
Regards,
Soumyadeep(VMware)
[1]
https://www.postgresql.org/message-id/flat/CANXE4Tc3FNvZ_xAimempJWv_RH9pCvsZH7Yq93o1VuNLjUT-mQ%40mail.gmail.com
From e1b93d4b3874eb0923e18dca9a9eccd453d6cd56 Mon Sep 17 00:00:00 2001
From: Soumyadeep Chakraborty <[email protected]>
Date: Mon, 2 Aug 2021 20:50:37 -0700
Subject: [PATCH v1 1/1] Refresh delayUntil in recovery delay loop
This ensures that the wait interval in the loop is correctly
recalculated with the updated value of recovery_min_apply_delay.
Now, if one changes the GUC while we are waiting, those changes will
take effect. Practical applications include instantly cancelling a long
wait time by setting recovery_min_apply_delay to 0. This can be useful
in tests.
---
src/backend/access/transam/xlog.c | 11 ++++++---
src/test/recovery/t/005_replay_delay.pl | 32 +++++++++++++++++++++----
2 files changed, 36 insertions(+), 7 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f84c0bb01eb..89dc759586c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6234,12 +6234,11 @@ recoveryApplyDelay(XLogReaderState *record)
if (!getRecordTimestamp(record, &xtime))
return false;
- delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
-
/*
* Exit without arming the latch if it's already past time to apply this
* record
*/
+ delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
msecs = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
if (msecs <= 0)
return false;
@@ -6255,7 +6254,13 @@ recoveryApplyDelay(XLogReaderState *record)
break;
/*
- * Wait for difference between GetCurrentTimestamp() and delayUntil
+ * Recalculate delayUntil as recovery_min_apply_delay could have changed
+ * while we were waiting in the loop.
+ */
+ delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
+
+ /*
+ * Wait for difference between GetCurrentTimestamp() and delayUntil.
*/
msecs = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
delayUntil);
diff --git a/src/test/recovery/t/005_replay_delay.pl b/src/test/recovery/t/005_replay_delay.pl
index 0b56380e0a7..74f821b3248 100644
--- a/src/test/recovery/t/005_replay_delay.pl
+++ b/src/test/recovery/t/005_replay_delay.pl
@@ -7,7 +7,7 @@ use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 3;
+use Test::More tests => 4;
# Initialize primary node
my $node_primary = PostgresNode->new('primary');
@@ -56,6 +56,30 @@ $node_standby->poll_query_until('postgres',
ok(time() - $primary_insert_time >= $delay,
"standby applies WAL only after replication delay");
+# Now set the delay for the next commit to some obscenely high value.
+$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO '2h';");
+$node_standby->reload;
+
+$node_primary->safe_psql('postgres',
+ "INSERT INTO tab_int VALUES (generate_series(21, 30))");
+
+# We should not have replayed the LSN from the last insert on the standby yet,
+# even in spite of it being received and flushed eventually. In other words, we
+# should be stuck in recovery_min_apply_delay.
+my $last_insert_lsn =
+ $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn();");
+$node_primary->wait_for_catchup('standby', 'flush', $last_insert_lsn);
+is( $node_standby->safe_psql('postgres',
+ "SELECT pg_last_wal_replay_lsn() < '$last_insert_lsn'::pg_lsn;"),
+ 't', 'standby stuck in recovery_min_apply_delay');
+
+# Clear the recovery_min_apply_delay timeout so that the wait is immediately
+# cancelled and replay can proceed.
+$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO 0;");
+$node_standby->reload;
+
+# Now the standby should catch up.
+$node_primary->wait_for_catchup('standby');
# Check that recovery can be paused or resumed expectedly.
my $node_standby2 = PostgresNode->new('standby2');
@@ -72,7 +96,7 @@ is( $node_standby2->safe_psql(
# Request to pause recovery and wait until it's actually paused.
$node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_pause()");
$node_primary->safe_psql('postgres',
- "INSERT INTO tab_int VALUES (generate_series(21,30))");
+ "INSERT INTO tab_int VALUES (generate_series(31,40))");
$node_standby2->poll_query_until('postgres',
"SELECT pg_get_wal_replay_pause_state() = 'paused'")
or die "Timed out while waiting for recovery to be paused";
@@ -84,7 +108,7 @@ my $receive_lsn =
my $replay_lsn =
$node_standby2->safe_psql('postgres', "SELECT pg_last_wal_replay_lsn()");
$node_primary->safe_psql('postgres',
- "INSERT INTO tab_int VALUES (generate_series(31,40))");
+ "INSERT INTO tab_int VALUES (generate_series(41,50))");
$node_standby2->poll_query_until('postgres',
"SELECT '$receive_lsn'::pg_lsn < pg_last_wal_receive_lsn()")
or die "Timed out while waiting for new WAL to be streamed";
@@ -102,7 +126,7 @@ $node_standby2->poll_query_until('postgres',
# is triggered while recovery is paused.
$node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_pause()");
$node_primary->safe_psql('postgres',
- "INSERT INTO tab_int VALUES (generate_series(41,50))");
+ "INSERT INTO tab_int VALUES (generate_series(51,60))");
$node_standby2->poll_query_until('postgres',
"SELECT pg_get_wal_replay_pause_state() = 'paused'")
or die "Timed out while waiting for recovery to be paused";
--
2.25.1