Hi, jian
On Sat, 17 Jan 2026 at 11:13, jian he <[email protected]> wrote: > On Fri, Jan 9, 2026 at 2:05 PM Japin Li <[email protected]> wrote: >> >> 2. >> + * Returns NIL if the role OID is invalid. This can happen if the role was >> + * dropped concurrently, or if we're passed a OID that doesn't match >> + * any role. >> >> However, when I tested concurrent DROP ROLE, the function can still return a >> non-NIL result (though incomplete). >> >> Here’s a reproducible scenario: >> >> a) Prepare >> -- Session 1 >> CREATE USER u01 WITH CONNECTION LIMIT 10; >> ALTER USER u01 IN DATABASE postgres SET work_mem TO '16MB'; >> SELECT pg_get_role_ddl_statements('u01'::regrole); >> >> b) Set a breakpoint in Session 1's backend using GDB at >> pg_get_role_ddl_internal. >> >> c) Execute the query in Session 1: >> --- Session 1 >> SELECT pg_get_role_ddl_statements('u01'::regrole); >> >> d) In GDB, step over the line: >> tuple = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid)); >> >> e) In Session 2, drop the user: >> --- Session 2 >> DROP USER u01; >> >> f) Continue execution in GDB. >> >> Result in Session 1: >> >> postgres=# SELECT pg_get_role_ddl_statements('u01'::regrole); >> pg_get_role_ddl_statements >> >> ------------------------------------------------------------------------------------------------------------------ >> CREATE ROLE u01 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN >> NOREPLICATION NOBYPASSRLS CONNECTION LIMIT 10; >> (1 row) >> >> We only get the CREATE ROLE statement; the ALTER ROLE ... SET work_mem >> statement is missing. This behavior does not fully match the comment, which >> implies that an invalid OID would return NIL. In this case, we get a partial >> (and potentially misleading) result instead. >> > > + tuple = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid)); > + if (!HeapTupleIsValid(tuple)) > + return NIL; > > after SearchSysCache1, HeapTupleIsValid, adding > > + LockSharedObject(AuthIdRelationId, roleid, 0, AccessShareLock); Thanks for pointing this out. > > can solve this problem. > We have a similar code pattern in DropRole. > I think this can handle most cases, but there is still a small window where a concurrent DROP USER could happen between SearchSysCache1() and LockSharedObject(). -- Regards, Japin Li ChengDu WenWu Information Technology Co., Ltd.
