Hi, jian

On Sat, 17 Jan 2026 at 11:13, jian he <[email protected]> wrote:
> On Fri, Jan 9, 2026 at 2:05 PM Japin Li <[email protected]> wrote:
>>
>> 2.
>> + * Returns NIL if the role OID is invalid.  This can happen if the role was
>> + * dropped concurrently, or if we're passed a OID that doesn't match
>> + * any role.
>>
>> However, when I tested concurrent DROP ROLE, the function can still return a
>> non-NIL result (though incomplete).
>>
>> Here’s a reproducible scenario:
>>
>> a) Prepare
>>    -- Session 1
>>    CREATE USER u01 WITH CONNECTION LIMIT 10;
>>    ALTER USER u01 IN DATABASE postgres SET work_mem TO '16MB';
>>    SELECT pg_get_role_ddl_statements('u01'::regrole);
>>
>> b) Set a breakpoint in Session 1's backend using GDB at 
>> pg_get_role_ddl_internal.
>>
>> c) Execute the query in Session 1:
>>    --- Session 1
>>    SELECT pg_get_role_ddl_statements('u01'::regrole);
>>
>> d) In GDB, step over the line:
>>    tuple = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
>>
>> e) In Session 2, drop the user:
>>    --- Session 2
>>    DROP USER u01;
>>
>> f) Continue execution in GDB.
>>
>> Result in Session 1:
>>
>>     postgres=# SELECT pg_get_role_ddl_statements('u01'::regrole);
>>                                                 pg_get_role_ddl_statements
>>     
>> ------------------------------------------------------------------------------------------------------------------
>>      CREATE ROLE u01 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN 
>> NOREPLICATION NOBYPASSRLS CONNECTION LIMIT 10;
>>     (1 row)
>>
>> We only get the CREATE ROLE statement; the ALTER ROLE ... SET work_mem
>> statement is missing. This behavior does not fully match the comment, which
>> implies that an invalid OID would return NIL. In this case, we get a partial
>> (and potentially misleading) result instead.
>>
>
> + tuple = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
> + if (!HeapTupleIsValid(tuple))
> + return NIL;
>
> after SearchSysCache1, HeapTupleIsValid, adding
>
> + LockSharedObject(AuthIdRelationId, roleid, 0, AccessShareLock);

Thanks for pointing this out.
>
> can solve this problem.
> We have a similar code pattern in DropRole.
>

I think this can handle most cases, but there is still a small window where a
concurrent DROP USER could happen between SearchSysCache1() and 
LockSharedObject().


-- 
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.


Reply via email to