Dear Alexander, Kuroda-san,
IIUC, the problem is that the walreceiver does not reach its main loop
before it receives the signal, so the expected log message never appears
on the publisher. So what do you think of this fix?
I added a check to the test scenario that the walreceiver on the standby
is fully initialized and replication has started. With this fix, I can no
longer reproduce the failure.
Regards,
Andrey Silitskiy
From 782c48d615abc961ab70cb33013561cedd6e6dac Mon Sep 17 00:00:00 2001
From: "a.silitskiy" <[email protected]>
Date: Wed, 3 Jun 2026 17:24:09 +0700
Subject: [PATCH v1] Fix walsnd_shutdown_timeout test case
---
.../subscription/t/038_walsnd_shutdown_timeout.pl | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/src/test/subscription/t/038_walsnd_shutdown_timeout.pl b/src/test/subscription/t/038_walsnd_shutdown_timeout.pl
index f4ed5d97852..ee5d3b77a23 100644
--- a/src/test/subscription/t/038_walsnd_shutdown_timeout.pl
+++ b/src/test/subscription/t/038_walsnd_shutdown_timeout.pl
@@ -164,16 +164,20 @@ $node_standby->append_conf(
hot_standby_feedback = on));
$node_standby->start;
+# Wait for replication to start
+$node_publisher->safe_psql('postgres', "INSERT INTO test_tab VALUES (-1);");
+$node_standby->poll_query_until('postgres',
+ "SELECT EXISTS (SELECT 1 FROM test_tab WHERE id = -1)");
+$node_subscriber->poll_query_until('postgres',
+ "SELECT EXISTS (SELECT 1 FROM test_tab WHERE id = -1)");
+
# Cause the logical apply worker to block on a lock by running conflicting
# transactions on the publisher and subscriber, stalling logical replication.
-$node_publisher->wait_for_catchup('test_sub');
$sub_session->query_safe("BEGIN; LOCK TABLE test_tab IN EXCLUSIVE MODE;");
-$node_publisher->safe_psql('postgres', "INSERT INTO test_tab VALUES (-1); ");
+$node_publisher->safe_psql('postgres', "INSERT INTO test_tab VALUES (-2); ");
# Cause the standby's walreceiver to be blocked with SIGSTOP signal,
# stalling physical replication.
-$node_standby->poll_query_until('postgres',
- "SELECT EXISTS(SELECT 1 FROM pg_stat_wal_receiver)");
my $receiverpid = $node_standby->safe_psql('postgres',
"SELECT pid FROM pg_stat_wal_receiver");
like($receiverpid, qr/^[0-9]+$/, "have walreceiver pid $receiverpid");
@@ -184,7 +188,7 @@ $log_offset = -s $node_publisher->logfile;
# Verify that the walsender exits due to wal_sender_shutdown_timeout
# even when both physical and logical replication are stalled.
# wal_sender_shutdown_timeout.
-$node_publisher->safe_psql('postgres', "INSERT INTO test_tab VALUES (-2);");
+$node_publisher->safe_psql('postgres', "INSERT INTO test_tab VALUES (-3);");
$node_publisher->stop('fast');
ok( $node_publisher->log_contains(
qr/WARNING: .* terminating walsender process due to replication shutdown timeout/,
--
2.34.1