On Fri, May 23, 2025 at 08:55:27PM +0530, vignesh C wrote:
> This issue can be consistently reproduced by injecting a delay (e.g.,
> 3 seconds) in tap_sub's walsender while decoding the commit of
> 'mygid'. A patch to demonstrate this behavior is provided at
> 021_two_phase_test_failure_reproduce.patch. The test can be fixed by
> explicitly waiting for both subscriptions to catch up before dropping
> either. A patch implementing this fix is attached.

+    if (parsed->twophase_xid && strcmp(parsed->twophase_gid, "mygid") == 0 &&
+        strcmp(NameStr(MyReplicationSlot->data.name), "tap_sub") == 0)
+        sleep(3);
+

Smart filtering to prove your point.

> +# Wait for both subscribers to catchup
>  $node_publisher->wait_for_catchup($appname_copy);
> +$node_publisher->wait_for_catchup($appname);
> +
> +# Make sure there are no prepared transactions on the subscriber
> +$result = $node_subscriber->safe_psql('postgres',
> +    "SELECT count(*) FROM pg_prepared_xacts;");
> +is($result, qq(0), 'should be no prepared transactions on subscriber');

Yes, agreed that your suggested fix looks sensible with an extra check
for pg_prepared_xacts on the subscriber side that can be useful for
debugging.

I'll take care of that, if there are no objections.
--
Michael