Hi Kyotaro,

Thanks for the review!

On Mon, Aug 2, 2021 at 11:42 PM Kyotaro Horiguchi
<horikyota....@gmail.com> wrote:

> One comment from me.
>
> +$node_standby->safe_psql('postgres', "ALTER SYSTEM SET 
> recovery_min_apply_delay TO 0;");
>
> It might be better do "SET reco.. TO DEFAULT" instead.
>

Sure.

> And how about adding the new test at the end of existing tests. We can
> get rid of the extra lines in the diff.

No problem. See attached v2. I didn't do that initially as the test file
looks as though it is split into two halves: one for testing
recovery_min_apply_delay and the other for testing recovery pause. So while
making this change, I added a header comment for the newly added test case.
Please take a look.

Regards,
Soumyadeep (VMware)
From 8b4ea51e1a22548fdd3f6921fe374d3a9d987a77 Mon Sep 17 00:00:00 2001
From: Soumyadeep Chakraborty <soumyadeep2...@gmail.com>
Date: Mon, 2 Aug 2021 20:50:37 -0700
Subject: [PATCH v2 1/1] Refresh delayUntil in recovery delay loop

This ensures that the wait interval in the loop is correctly
recalculated with the updated value of recovery_min_apply_delay.

Now, if one changes the GUC while we are waiting, those changes will
take effect. Practical applications include instantly cancelling a long
wait time by setting recovery_min_apply_delay to 0. This can be useful
in tests.
---
 src/backend/access/transam/xlog.c       | 11 ++++++---
 src/test/recovery/t/005_replay_delay.pl | 31 +++++++++++++++++++++++--
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f84c0bb01eb..89dc759586c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6234,12 +6234,11 @@ recoveryApplyDelay(XLogReaderState *record)
 	if (!getRecordTimestamp(record, &xtime))
 		return false;
 
-	delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
-
 	/*
 	 * Exit without arming the latch if it's already past time to apply this
 	 * record
 	 */
+	delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
 	msecs = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
 	if (msecs <= 0)
 		return false;
@@ -6255,7 +6254,13 @@ recoveryApplyDelay(XLogReaderState *record)
 			break;
 
 		/*
-		 * Wait for difference between GetCurrentTimestamp() and delayUntil
+		 * Recalculate delayUntil as recovery_min_apply_delay could have changed
+		 * while we were waiting in the loop.
+		 */
+		delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
+
+		/*
+		 * Wait for difference between GetCurrentTimestamp() and delayUntil.
 		 */
 		msecs = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
 												delayUntil);
diff --git a/src/test/recovery/t/005_replay_delay.pl b/src/test/recovery/t/005_replay_delay.pl
index 0b56380e0a7..d55f1fc257f 100644
--- a/src/test/recovery/t/005_replay_delay.pl
+++ b/src/test/recovery/t/005_replay_delay.pl
@@ -7,7 +7,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 3;
+use Test::More tests => 4;
 
 # Initialize primary node
 my $node_primary = PostgresNode->new('primary');
@@ -56,7 +56,6 @@ $node_standby->poll_query_until('postgres',
 ok(time() - $primary_insert_time >= $delay,
 	"standby applies WAL only after replication delay");
 
-
 # Check that recovery can be paused or resumed expectedly.
 my $node_standby2 = PostgresNode->new('standby2');
 $node_standby2->init_from_backup($node_primary, $backup_name,
@@ -110,3 +109,31 @@ $node_standby2->poll_query_until('postgres',
 $node_standby2->promote;
 $node_standby2->poll_query_until('postgres', "SELECT NOT pg_is_in_recovery()")
   or die "Timed out while waiting for promotion to finish";
+
+# Now test to see if updates to recovery_min_apply_delay apply when a standby is
+# waiting for a recovery delay to elapse.
+
+# First, set the delay for the next commit to some obscenely high value.
+$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO '2h';");
+$node_standby->reload;
+
+$node_primary->safe_psql('postgres',
+	"INSERT INTO tab_int VALUES (generate_series(51, 60))");
+
+# We should not have replayed the LSN from the last insert on the standby yet,
+# even though it will be received and flushed eventually. In other words, we
+# should be stuck in recovery_min_apply_delay.
+my $last_insert_lsn =
+	$node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn();");
+$node_primary->wait_for_catchup('standby', 'flush', $last_insert_lsn);
+is( $node_standby->safe_psql('postgres',
+	"SELECT pg_last_wal_replay_lsn() < '$last_insert_lsn'::pg_lsn;"),
+	't', 'standby stuck in recovery_min_apply_delay');
+
+# Clear the recovery_min_apply_delay timeout so that the wait is immediately
+# cancelled and replay can proceed.
+$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO DEFAULT;");
+$node_standby->reload;
+
+# Now the standby should catch up.
+$node_primary->wait_for_catchup('standby');
-- 
2.25.1

Reply via email to