On Fri, Jul 3, 2026 at 9:09 PM Bertrand Drouvot <[email protected]> wrote: > > Hi, > > On Fri, Jul 03, 2026 at 03:45:34PM +0530, Amit Kapila wrote: > > On Fri, Jul 3, 2026 at 1:38 PM Zhijie Hou (Fujitsu) > > <[email protected]> wrote: > > > > > > On Friday, July 3, 2026 1:53 PM Bertrand Drouvot > > > <[email protected]> wrote: > > > > > > > > > but given the patch's simplicity, I recommend backpatching. > > > > > > > > That's right but that would only improve error messages. That said, > > > > looking > > > > closer, they are elog() ones, so "not expected" to occur so yeah > > > > backpatch > > > > does make sense. > > > > > > +1 for backpatching, even if it's rare, the "ERROR: tuple concurrently > > > updated" > > > message seems confusing to me. > > > > > > > I also think backpatching makes sense. BTW, I have a comment: > > Thanks for looking at it! > > > + heap_freetuple(tup); > > + tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, > > ObjectIdGetDatum(MyDatabaseId), > > + CStringGetDatum(stmt->subname)); > > > > heap_freetuple() could be done before acquiring the lock, is there a > > reason to keep it after lock? > > No particular reason, could be done before. Done in 0001 attached. > > > > > > > > > > > That said, what about also fixing DropSubscription() like in the 0002 > > > > attached? > > > > (that would also produce those elog() messages in case of concurrent > > > > DROP or > > > > ALTER). > > > > > > For the patch, I'm not sure if we must repeat the checks twice. Could we > > > simply move the original checks to after we take the lock? At least, the > > > GetSubscription() call and the password check can be moved there and old > > > codes > > > can be deleted. > > > > > > > Isn't the same true for the AlterSubscription() case as well? > > I think there is no need to lock if we are later going to disallow changing > the > subscription data due to the password_required/superuser check. > > That said moving it as suggested by Hou-san, does simplify the code and the > lock > is not held for long, so done that way in 0001. > > > Also, I > > noticed that AlterPublication() does the same trick but it uses > > PUBLICATIONOID cacheid, so shouldn't we use SUBSCRIPTIONOID cacheid > > here as well? I think this is to prevent the case where the same name > > pub/sub is recreated after lock. > > Oh right and I did it that way in 0001 and 0002. > > But while doing this and looking closely, I'm not sure AlterPublication() does > it right. Indeed, in theory, the OID could have been re-used too (between the > time we did the name resolution and the time we lock the publication). I think > what is needed is something similar to RangeVarGetRelidExtended(), means do > the > name resolution, acl check (ownership) and lock acquisition, all in unison. > > That's what 0003 is trying to achieve for the subscription and 0004 for the > publication. > > What do you think? > 0003:
It looks like the implementation of DROP SUBSCRIPTION IF EXISTS has a concurrent drop race condition in DropSubscription(). Currently, if stmt->missing_ok is true, the initial lookup safely handles a missing subscription. However, once a subscription is found and the code enters the drop loop, a second internal lookup/refetch happens. If a concurrent transaction drops the subscription after our initial check but before this internal refetch, the code throws an error. Essentially, the loop completely ignores the missing_ok flag during the refetch phase. Am I missing something? -- Regards, Dilip Kumar Google
