Running the regression tests with the patch I showed in https://www.postgresql.org/message-id/16565.1538327...@sss.pgh.pa.us exposes two places where HEAD is opening relations without having any lock at all on them:
1. ALTER TABLE ... ALTER COLUMN TYPE, on a column that is part of a foreign key constraint, opens the rel that has the other end of the constraint before it's acquired a lock on said rel. The comment in ATPostAlterTypeCleanup claims this is "safe because the parser won't actually look at the catalogs to detect the existing entry", but I think that's largely horsepucky. The parser absolutely does do relation_open, and it expects the caller to have gotten a lock sufficient to protect that (cf transformAlterTableStmt). It's possible that this doesn't have any real effect. Since we're already holding AccessExclusiveLock on our own end of the FK constraint, it'd be impossible for another session to drop the FK constraint, or by extension the other table. Still, running a relcache load on a table we have no lock on seems pretty unsafe, especially so in branches before we used MVCC for catalog reads. So I'm inclined to apply the attached patch all the way back. (The mentioned comment also needs rewritten; this is just the minimal code change to get rid of the test failure.) 2. pageinspect's tuple_data_split_internal supposes that it needs no lock on the referenced table. Perhaps there was an expectation that some earlier function would have taken a lock and not released it, but this is demonstrably not happening in the module's own regression test. I think we should just take AccessShareLock there and not try to be cute. Again, this seems to be back-patch material. Thoughts, objections? regards, tom lane
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 9e60bb3..9e7ae73 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -9905,6 +9905,7 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode) Form_pg_constraint con; Oid relid; Oid confrelid; + char contype; bool conislocal; tup = SearchSysCache1(CONSTROID, ObjectIdGetDatum(oldId)); @@ -9921,6 +9922,7 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode) elog(ERROR, "could not identify relation associated with constraint %u", oldId); } confrelid = con->confrelid; + contype = con->contype; conislocal = con->conislocal; ReleaseSysCache(tup); @@ -9936,6 +9938,15 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode) if (!conislocal) continue; + /* + * When rebuilding an FK constraint that references the table we're + * modifying, we might not yet have any lock on the FK's table, so get + * one now. We'll need AccessExclusiveLock for the DROP CONSTRAINT + * step, so there's no value in asking for anything weaker. + */ + if (relid != tab->relid && contype == CONSTRAINT_FOREIGN) + LockRelationOid(relid, AccessExclusiveLock); + ATPostAlterTypeParse(oldId, relid, confrelid, (char *) lfirst(def_item), wqueue, lockmode, tab->rewrite);
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c index 7438257..513db4b 100644 --- a/contrib/pageinspect/heapfuncs.c +++ b/contrib/pageinspect/heapfuncs.c @@ -298,9 +298,8 @@ tuple_data_split_internal(Oid relid, char *tupdata, TupleDesc tupdesc; /* Get tuple descriptor from relation OID */ - rel = relation_open(relid, NoLock); - tupdesc = CreateTupleDescCopyConstr(rel->rd_att); - relation_close(rel, NoLock); + rel = relation_open(relid, AccessShareLock); + tupdesc = RelationGetDescr(rel); raw_attrs = initArrayResult(BYTEAOID, CurrentMemoryContext, false); nattrs = tupdesc->natts; @@ -317,7 +316,6 @@ tuple_data_split_internal(Oid relid, char *tupdata, bytea *attr_data = NULL; attr = TupleDescAttr(tupdesc, i); - is_null = (t_infomask & HEAP_HASNULL) && att_isnull(i, t_bits); /* * Tuple header can specify less attributes than tuple descriptor as @@ -327,6 +325,8 @@ tuple_data_split_internal(Oid relid, char *tupdata, */ if (i >= (t_infomask2 & HEAP_NATTS_MASK)) is_null = true; + else + is_null = (t_infomask & HEAP_HASNULL) && att_isnull(i, t_bits); if (!is_null) { @@ -385,6 +385,7 @@ tuple_data_split_internal(Oid relid, char *tupdata, ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), errmsg("end of tuple reached without looking at all its data"))); + relation_close(rel, AccessShareLock); return makeArrayResult(raw_attrs, CurrentMemoryContext); }