Hi Kyotaro, Thanks for the review!
On Mon, Aug 2, 2021 at 11:42 PM Kyotaro Horiguchi <horikyota....@gmail.com> wrote: > One comment from me. > > +$node_standby->safe_psql('postgres', "ALTER SYSTEM SET > recovery_min_apply_delay TO 0;"); > > It might be better do "SET reco.. TO DEFAULT" instead. > Sure. > And how about adding the new test at the end of existing tests. We can > get rid of the extra lines in the diff. No problem. See attached v2. I didn't do that initially as the test file looks as though it is split into two halves: one for testing recovery_min_apply_delay and the other for testing recovery pause. So while making this change, I added a header comment for the newly added test case. Please take a look. Regards, Soumyadeep (VMware)
From 8b4ea51e1a22548fdd3f6921fe374d3a9d987a77 Mon Sep 17 00:00:00 2001 From: Soumyadeep Chakraborty <soumyadeep2...@gmail.com> Date: Mon, 2 Aug 2021 20:50:37 -0700 Subject: [PATCH v2 1/1] Refresh delayUntil in recovery delay loop This ensures that the wait interval in the loop is correctly recalculated with the updated value of recovery_min_apply_delay. Now, if one changes the GUC while we are waiting, those changes will take effect. Practical applications include instantly cancelling a long wait time by setting recovery_min_apply_delay to 0. This can be useful in tests. --- src/backend/access/transam/xlog.c | 11 ++++++--- src/test/recovery/t/005_replay_delay.pl | 31 +++++++++++++++++++++++-- 2 files changed, 37 insertions(+), 5 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f84c0bb01eb..89dc759586c 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6234,12 +6234,11 @@ recoveryApplyDelay(XLogReaderState *record) if (!getRecordTimestamp(record, &xtime)) return false; - delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay); - /* * Exit without arming the latch if it's already past time to apply this * record */ + delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay); msecs = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil); if (msecs <= 0) return false; @@ -6255,7 +6254,13 @@ recoveryApplyDelay(XLogReaderState *record) break; /* - * Wait for difference between GetCurrentTimestamp() and delayUntil + * Recalculate delayUntil as recovery_min_apply_delay could have changed + * while we were waiting in the loop. + */ + delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay); + + /* + * Wait for difference between GetCurrentTimestamp() and delayUntil. */ msecs = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil); diff --git a/src/test/recovery/t/005_replay_delay.pl b/src/test/recovery/t/005_replay_delay.pl index 0b56380e0a7..d55f1fc257f 100644 --- a/src/test/recovery/t/005_replay_delay.pl +++ b/src/test/recovery/t/005_replay_delay.pl @@ -7,7 +7,7 @@ use warnings; use PostgresNode; use TestLib; -use Test::More tests => 3; +use Test::More tests => 4; # Initialize primary node my $node_primary = PostgresNode->new('primary'); @@ -56,7 +56,6 @@ $node_standby->poll_query_until('postgres', ok(time() - $primary_insert_time >= $delay, "standby applies WAL only after replication delay"); - # Check that recovery can be paused or resumed expectedly. my $node_standby2 = PostgresNode->new('standby2'); $node_standby2->init_from_backup($node_primary, $backup_name, @@ -110,3 +109,31 @@ $node_standby2->poll_query_until('postgres', $node_standby2->promote; $node_standby2->poll_query_until('postgres', "SELECT NOT pg_is_in_recovery()") or die "Timed out while waiting for promotion to finish"; + +# Now test to see if updates to recovery_min_apply_delay apply when a standby is +# waiting for a recovery delay to elapse. + +# First, set the delay for the next commit to some obscenely high value. +$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO '2h';"); +$node_standby->reload; + +$node_primary->safe_psql('postgres', + "INSERT INTO tab_int VALUES (generate_series(51, 60))"); + +# We should not have replayed the LSN from the last insert on the standby yet, +# even though it will be received and flushed eventually. In other words, we +# should be stuck in recovery_min_apply_delay. +my $last_insert_lsn = + $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn();"); +$node_primary->wait_for_catchup('standby', 'flush', $last_insert_lsn); +is( $node_standby->safe_psql('postgres', + "SELECT pg_last_wal_replay_lsn() < '$last_insert_lsn'::pg_lsn;"), + 't', 'standby stuck in recovery_min_apply_delay'); + +# Clear the recovery_min_apply_delay timeout so that the wait is immediately +# cancelled and replay can proceed. +$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO DEFAULT;"); +$node_standby->reload; + +# Now the standby should catch up. +$node_primary->wait_for_catchup('standby'); -- 2.25.1