On Mon, Jul 7, 2025 at 8:15 PM Ajin Cherian <itsa...@gmail.com> wrote:
>
> On Sun, Jul 6, 2025 at 2:00 AM Alexander Lakhin <exclus...@gmail.com> wrote:
> >
> > --- a/src/backend/replication/logical/origin.c
> > +++ b/src/backend/replication/logical/origin.c
> > @@ -428,6 +428,7 @@ replorigin_drop_by_name(const char *name, bool missing_ok, bool nowait)
> >   * the specific origin and then re-check if the origin still exists.
> >   */
> >  rel = table_open(ReplicationOriginRelationId, ExclusiveLock);
> > +pg_usleep(300000);
> >
> > Not reproduced on REL_16_STABLE (since f6c5edb8a), nor in v14- (because
> > 024_add_drop_pub.pl was added in v15).
> >
> > [1]
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2025-07-01%2018%3A00%3A58
> >
> > Best regards,
> > Alexander
> >
>
> Hi Alexander,
>
> Yes, the problem can be reproduced by the changes you suggested. I
> will look into what is happening and how we can fix this.

The issue appears to be a deadlock caused by inconsistent lock
acquisition order between two processes:

Process A (executing ALTER SUBSCRIPTION tap_sub DROP PUBLICATION tap_pub_1):
In AlterSubscription_refresh(), it first acquires an AccessExclusiveLock
on SubscriptionRelRelationId (resource 1), then later tries to acquire an
ExclusiveLock on ReplicationOriginRelationId (resource 2).

Process B (apply worker):
In process_syncing_tables_for_apply(), it first acquires an ExclusiveLock
on ReplicationOriginRelationId (resource 2), then calls
UpdateSubscriptionRelState(), which tries to acquire an AccessShareLock
on SubscriptionRelRelationId (resource 1).

This leads to a deadlock: Process A holds a lock on resource 1 and waits
for resource 2, while Process B holds a lock on resource 2 and waits for
resource 1.

Proposed fix:
In process_syncing_tables_for_apply(), acquire an AccessExclusiveLock on
SubscriptionRelRelationId before acquiring the lock on
ReplicationOriginRelationId.

Patch with fix attached. I'll continue investigating whether this issue
also affects HEAD.

regards,
Ajin Cherian
Fujitsu Australia.
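
The lock-ordering argument above can be checked with a small standalone
model. This is only a sketch, not PostgreSQL code and not the attached
patch: the enum labels and the function names position_of() and
lock_orders_conflict() are hypothetical, and the model assumes that the
lock modes taken on the same catalog always conflict (which holds for the
modes named above). It reports whether two processes acquire some pair of
resources in opposite orders, which is the condition under which they can
deadlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the two catalogs named in the analysis. */
enum resource
{
    RES_SUBSCRIPTION_REL = 1,   /* stands for SubscriptionRelRelationId */
    RES_REPLICATION_ORIGIN = 2  /* stands for ReplicationOriginRelationId */
};

/* Index of res in seq, or -1 if seq never acquires it. */
int
position_of(const enum resource *seq, size_t n, enum resource res)
{
    for (size_t i = 0; i < n; i++)
    {
        if (seq[i] == res)
            return (int) i;
    }
    return -1;
}

/*
 * Return true if sequences a and b acquire some pair of resources in
 * opposite orders.  Assumption: any two locks on the same resource
 * conflict, so opposite orders are enough to allow a deadlock.
 */
bool
lock_orders_conflict(const enum resource *a, size_t na,
                     const enum resource *b, size_t nb)
{
    assert(a != NULL && b != NULL);

    for (size_t i = 0; i < na; i++)
    {
        for (size_t j = i + 1; j < na; j++)
        {
            /* a takes a[i] before a[j]; see where b takes them. */
            int first_in_b = position_of(b, nb, a[i]);
            int second_in_b = position_of(b, nb, a[j]);

            if (first_in_b >= 0 && second_in_b >= 0 &&
                second_in_b < first_in_b)
                return true;
        }
    }
    return false;
}
```

With Process A modeled as {RES_SUBSCRIPTION_REL, RES_REPLICATION_ORIGIN}
and the current apply worker as {RES_REPLICATION_ORIGIN,
RES_SUBSCRIPTION_REL}, the check reports a conflict. Once the apply worker
takes the SubscriptionRelRelationId lock first, as proposed, both
sequences agree and the check reports none.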
0001-Fix-a-deadlock-during-ALTER-SUBSCRIPTION-.-DROP-PUBL.patch