Hi,

We encountered a problem while upgrading PostgreSQL to 18 and created
a new check to pg_upgrade to detect objects with invalid role
references in pg_init_privs.

Problem: When a role is deleted after granting privileges but before
initial privileges are recorded, pg_upgrade fails during restore.

pg_upgrade --check passes all consistency checks but then fails during
the actual upgrade when pg_restore attempts to GRANT privileges to a
role OID (e.g. "16424") that no longer exists. The root cause is
orphan records in pg_init_privs whose ACL entries reference a dropped
role by OID. pg_upgrade does not detect these dangling references
during its pre-flight checks.

To reproduce the issue apply the following steps.  In my case, I was
working with the pg_wait_sampling extension.


Create a database with owner and create the extension of pg_wait_sampling

```
-- Create test role and database
CREATE ROLE benchmark_owner SUPERUSER;
CREATE DATABASE my_db OWNER benchmark_owner;
\c my_db
SET ROLE benchmark_owner;
CREATE EXTENSION pg_wait_sampling;
RESET ROLE;

-- Verify pg_init_privs.

SELECT pip.objoid, pip.classoid, pip.privtype, pip.initprivs, e.extname
FROM pg_init_privs pip
JOIN pg_depend d ON d.objid = pip.objoid
JOIN pg_extension e ON e.oid = d.refobjid
WHERE e.extname = 'pg_wait_sampling'
AND pip.privtype = 'e';


postgres=# \c my_db
psql (13.23, server 14.23)
WARNING: psql major version 13, server major version 14.
Some psql features might not work.
You are now connected to database "my_db" as user "postgres".
my_db=# SET ROLE benchmark_owner;
CREATE EXTENSION pg_wait_sampling;
RESET ROLE;
SET
ERROR: permission denied to create extension "pg_wait_sampling"
HINT: Must be superuser to create this extension.
RESET
my_db=# alter role benchmark_owner superuser ;
ALTER ROLE
my_db=# SET ROLE benchmark_owner;
CREATE EXTENSION pg_wait_sampling;
RESET ROLE;
SET
CREATE EXTENSION
RESET
my_db=# SELECT pip.objoid, pip.classoid, pip.privtype, pip.initprivs, e.extname
FROM pg_init_privs pip
JOIN pg_depend d ON d.objid = pip.objoid
JOIN pg_extension e ON e.oid = d.refobjid
WHERE e.extname = 'pg_wait_sampling'
AND pip.privtype = 'e';
objoid | classoid | privtype | initprivs | extname
--------+----------+----------+--------------------------------------------------------------+------------------
16429 | 1259 | e |
{benchmark_owner=arwdDxt/benchmark_owner,=r/benchmark_owner} |
pg_wait_sampling
16434 | 1259 | e |
{benchmark_owner=arwdDxt/benchmark_owner,=r/benchmark_owner} |
pg_wait_sampling
16439 | 1259 | e |
{benchmark_owner=arwdDxt/benchmark_owner,=r/benchmark_owner} |
pg_wait_sampling
16443 | 1255 | e | {benchmark_owner=X/benchmark_owner} | pg_wait_sampling
(4 rows)
```

Rename the database, re-assign the role, and drop the role.

```
/c postgres
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'my_db';ALTER DATABASE my_db RENAME TO my_db_v2;
\c my_db_v2
REASSIGN OWNED BY benchmark_owner TO postgres;
DROP OWNED BY benchmark_owner;
\c postgres
DROP ROLE benchmark_owner;
```

Verify the dangling/orphan records.
```
-- Verify the dangling/orphan records on pg_init_privs

-- Should show numeric OID instead of role name
SELECT * FROM pg_init_privs WHERE privtype = 'e';

-- Should return rows (dangling references)
SELECT pip.objoid
FROM pg_init_privs pip
CROSS JOIN LATERAL aclexplode(pip.initprivs) ace
LEFT JOIN pg_authid a ON a.oid = ace.grantee
WHERE a.oid IS NULL
  AND ace.grantee <> 0;
```

See the pg_upgrade error

```
/usr/pgsql-18/bin/pg_upgrade --old-datadir=/var/lib/pgsql/14/data
--new-datadir=/var/lib/pgsql/18/data --old-bindir=/usr/pgsql-14/bin/
--new-bindir=/usr/pgsql-18/bin/ --check --jobs=4


[postgres@localhost data]$ /usr/pgsql-18/bin/pg_upgrade
--old-datadir=/var/lib/pgsql/14/data
--new-datadir=/var/lib/pgsql/18/data --old-bindir=/usr/pgsql-14/bin/
--new-bindir=/usr/pgsql-18/bin/ --check --jobs=4
Performing Consistency Checks on Old Live Server
------------------------------------------------
Checking cluster versions                                     ok
Checking database connection settings                         ok
Checking database user is the install user                    ok
Checking for prepared transactions                            ok
Checking for contrib/isn with bigint-passing mismatch         ok
Checking data type usage                                      ok
Checking for not-null constraint inconsistencies              ok
Checking for presence of required libraries                   ok
Checking database user is the install user                    ok
Checking for prepared transactions                            ok
Checking for new cluster tablespace directories               ok

*Clusters are compatible*


[postgres@localhost data]$ /usr/pgsql-18/bin/pg_upgrade
--old-datadir=/var/lib/pgsql/14/data
--new-datadir=/var/lib/pgsql/18/data --old-bindir=/usr/pgsql-14/bin/
--new-bindir=/usr/pgsql-18/bin/ --jobs=4
Performing Consistency Checks
-----------------------------
Checking cluster versions                                     ok
Checking database connection settings                         ok
Checking database user is the install user                    ok
Checking for prepared transactions                            ok
Checking for contrib/isn with bigint-passing mismatch         ok
Checking data type usage                                      ok
Checking for not-null constraint inconsistencies              ok
Creating dump of global objects                               ok
Creating dump of database schemas
                                                              ok
Checking for presence of required libraries                   ok
Checking database user is the install user                    ok
Checking for prepared transactions                            ok
Checking for new cluster tablespace directories               ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Setting locale and encoding for new cluster                   ok
Analyzing all rows in the new cluster                         ok
Freezing all rows in the new cluster                          ok
Deleting files from new pg_xact                               ok
Copying old pg_xact to new server                             ok
Setting oldest XID for new cluster                            ok
Setting next transaction ID and epoch for new cluster         ok
Deleting files from new pg_multixact/offsets                  ok
Copying old pg_multixact/offsets to new server                ok
Deleting files from new pg_multixact/members                  ok
Copying old pg_multixact/members to new server                ok
Setting next multixact ID and offset for new cluster          ok
Resetting WAL archives                                        ok
Setting the default char signedness for new cluster           ok
Setting frozenxid and minmxid counters in new cluster         ok
Restoring global objects in the new cluster                   ok
Restoring database schemas in the new cluster
  my_db_v2
*failure*

Consult the last few lines of
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log"
for
the probable cause of the failure.
Failure, exiting

*failure*

Consult the last few lines of
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16384.log"
for
the probable cause of the failure.
Failure, exiting

child process exited abnormally: status 256
Failure, exiting
[postgres@localhost data]$
```

See the pg_upgrade logs

```
[postgres@localhost data]$ cat
/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log
command: "/usr/pgsql-18/bin/pg_dump" --host /var/lib/pgsql/18/data
--port 50432 --username postgres --no-data --sequence-data
--quote-all-identifiers --binary-upgrade --format=custom --statistics
--no-sync 
--file="/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/dump/pg_upgrade_dump_16425.custom"
'dbname=my_db_v2' >>
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log"
2>&1


command: "/usr/pgsql-18/bin/pg_restore" --host /var/lib/pgsql/18/data
--port 50432 --username postgres --create --exit-on-error --verbose
--transaction-size=250 --dbname template1
"/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/dump/pg_upgrade_dump_16425.custom"
>> "/var/lib/pgsql/18/data/pg_upgrade_output.d/20260518T065340.119/log/pg_upgrade_dump_16425.log"
2>&1
pg_restore: connecting to database for restore
pg_restore: executing SELECT pg_catalog.set_config('search_path', '', false);
pg_restore: creating DATABASE "my_db_v2"
pg_restore: connecting to new database "my_db_v2"
pg_restore: executing SELECT pg_catalog.set_config('search_path', '', false);
pg_restore: creating DATABASE PROPERTIES "my_db_v2"
pg_restore: connecting to new database "my_db_v2"
pg_restore: executing SELECT pg_catalog.set_config('search_path', '', false);
pg_restore: creating pg_largeobject "pg_largeobject"
pg_restore: creating SCHEMA "public"
pg_restore: creating COMMENT "SCHEMA "public""
pg_restore: creating EXTENSION "pg_wait_sampling"
pg_restore: creating COMMENT "EXTENSION "pg_wait_sampling""
pg_restore: creating FUNCTION "public.pg_wait_sampling_get_current(integer)"
pg_restore: creating FUNCTION "public.pg_wait_sampling_get_history()"
pg_restore: creating FUNCTION "public.pg_wait_sampling_get_profile()"
pg_restore: creating FUNCTION "public.pg_wait_sampling_reset_profile()"
pg_restore: creating VIEW "public.pg_wait_sampling_current"
pg_restore: creating VIEW "public.pg_wait_sampling_history"
pg_restore: creating VIEW "public.pg_wait_sampling_profile"
pg_restore: creating ACL "SCHEMA "public""
pg_restore: creating ACL "public.FUNCTION "pg_wait_sampling_reset_profile"()"
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 4338; 0 0 ACL FUNCTION
"pg_wait_sampling_reset_profile"() postgres
pg_restore: error: could not execute query: ERROR:  role "16424" does not exist
Command was: SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);
REVOKE ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() FROM PUBLIC;
REVOKE ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"()
FROM "postgres";
SET SESSION AUTHORIZATION "16424";
GRANT ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() TO "16424";
RESET SESSION AUTHORIZATION;
SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);
REVOKE ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() FROM "16424";
GRANT ALL ON FUNCTION "public"."pg_wait_sampling_reset_profile"() TO "postgres";
```

Regards.

Attachment: v1-0001-add-pg-upgrade-check-for-invalid-init-privs.patch
Description: Binary data

Reply via email to