On Thu, 25 Jul 2024 at 08:39, Amit Kapila <[email protected]> wrote:
>
> On Wed, Jul 24, 2024 at 9:13 PM Tom Lane <[email protected]> wrote:
> >
> > Amit Kapila <[email protected]> writes:
> > > I merged these changes, made a few other cosmetic changes, and pushed the
> > > patch.
> >
> > There is a CF entry pointing at this thread [1]. Should it be closed?
> >
>
> Yes, closed now. Thanks for the reminder.
I noticed a random failure of the 021_twophase test in my environment:
[10:37:01.131](0.053s) ok 24 - should be no prepared transactions on subscriber
error running SQL: 'psql:<stdin>:2: ERROR: cannot alter two_phase
when logical replication worker is still running
HINT: Try again after some time.'
The issue can be reproduced by adding a delay in apply_worker_exit, as
in the attached Reproduce_random_021_twophase_test_failure.patch.
This is happening because the check here is wrong:
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type =
'logical replication worker'"
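Here "logical replication worker" should be "logical replication apply worker".
With the wrong string, the test does not actually wait for the apply worker
to exit before it tries to alter two_phase. The attached patch fixes the check
in both places.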
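
To illustrate why the old check did not wait, here is a small standalone
simulation in Python. It is only a sketch: the list of backend types is a
hypothetical snapshot, not captured output, and no_backend_of_type() is a
made-up helper that mimics the "SELECT count(*) = 0 ... WHERE backend_type = ..."
query used by the test.

```python
# Hypothetical snapshot of pg_stat_activity.backend_type on the subscriber,
# taken right after ALTER SUBSCRIPTION ... DISABLE while the apply worker
# is still shutting down. Values are illustrative only.
activity = [
    "client backend",
    "logical replication apply worker",
    "checkpointer",
]


def no_backend_of_type(rows, backend_type):
    """Mimic: SELECT count(*) = 0 FROM pg_stat_activity
    WHERE backend_type = '<backend_type>'."""
    return sum(1 for row in rows if row == backend_type) == 0


# Old check: the string matches no row, so the condition is true at once
# and a polling loop built on it would return without waiting.
old_check = no_backend_of_type(activity, "logical replication worker")

# Fixed check: the apply worker is still listed, so the condition is false
# and a polling loop would keep retrying until the worker is gone.
new_check = no_backend_of_type(activity, "logical replication apply worker")

print(old_check)  # True
print(new_check)  # False
```

In this model the old condition is satisfied even though the apply worker
is still present, which matches the failure seen when the next ALTER
SUBSCRIPTION runs too early.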
Regards,
Vignesh
From 981cb77850f6576bf4f82ddad616623a3ef27ed8 Mon Sep 17 00:00:00 2001
From: Vignesh C <[email protected]>
Date: Tue, 30 Jul 2024 15:45:04 +0530
Subject: [PATCH] Fix random failure in 021_twophase.
After disabling the subscription, the failed test was changing the two_phase
option for the subscription. We missed waiting for the apply worker to exit
because the check used the wrong backend_type.
---
src/test/subscription/t/021_twophase.pl | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/test/subscription/t/021_twophase.pl b/src/test/subscription/t/021_twophase.pl
index a47d3b7dd6..5e50f1af33 100644
--- a/src/test/subscription/t/021_twophase.pl
+++ b/src/test/subscription/t/021_twophase.pl
@@ -385,7 +385,7 @@ is($result, qq(t), 'two-phase is enabled');
$node_subscriber->safe_psql('postgres',
"ALTER SUBSCRIPTION tap_sub_copy DISABLE;");
$node_subscriber->poll_query_until('postgres',
- "SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type = 'logical replication worker'"
+ "SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type = 'logical replication apply worker'"
);
$node_subscriber->safe_psql(
'postgres', "
@@ -434,7 +434,7 @@ is($result, qq(0), 'should be no prepared transactions on subscriber');
$node_subscriber->safe_psql('postgres',
"ALTER SUBSCRIPTION tap_sub_copy DISABLE;");
$node_subscriber->poll_query_until('postgres',
- "SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type = 'logical replication worker'"
+ "SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type = 'logical replication apply worker'"
);
$node_subscriber->safe_psql(
'postgres', "
--
2.34.1
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ec96b5fe85..f49eab78c2 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3827,6 +3827,7 @@ send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
static void
apply_worker_exit(void)
{
+ sleep(1);
if (am_parallel_apply_worker())
{
/*