I wrote:
> * We now recursively enter ScanPgRelation, which (again) needs to do a
> search using pg_class_oid_index, so it (again) opens and locks that.
> BUT: LockRelationOid sees that *this process already has share lock on
> pg_class_oid_index*, so it figures it can skip AcceptInvalidationMessages.

BTW, I now have a theory for why we suddenly started seeing this problem
in mid-June: commits a54e1f158 et al added a ScanPgRelation call where
there had been none before (in RelationReloadNailed, for non-index rels).
That didn't create the problem, but it probably increased the odds of
seeing it happen.

Also ... isn't the last "relation->rd_isvalid = true" in
RelationReloadNailed wrong?  If it got cleared during ScanPgRelation,
I do not think we want to believe that we got an up-to-date row.

			regards, tom lane
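
PS: to make the rd_isvalid concern concrete, here is a standalone toy
sketch.  It is not the relcache code: ToyRel, toy_scan, pending_invals,
catalog_version, reload_unconditional and reload_checked are all
hypothetical names invented for illustration, and the retry loop is only
one possible way to handle the case, not necessarily what the real fix
should look like.

```c
#include <stdbool.h>

/* Toy stand-in for a relcache entry; NOT the real RelationData. */
typedef struct ToyRel
{
	bool		rd_isvalid;
	int			row_version;	/* which version of the catalog row we copied */
} ToyRel;

/* Hypothetical "current" version of the relation's pg_class row. */
static int	catalog_version = 1;

/* Hypothetical count of invalidations that will arrive during scans. */
static int	pending_invals = 0;

/*
 * Toy stand-in for the catalog scan.  It reads the row first; if an
 * invalidation is pending, the row is then updated behind our back and
 * the entry's valid flag is cleared, as an inval callback would do.
 */
static int
toy_scan(ToyRel *rel)
{
	int			seen = catalog_version;

	if (pending_invals > 0)
	{
		pending_invals--;
		catalog_version++;
		rel->rd_isvalid = false;
	}
	return seen;
}

/* The questioned pattern: mark valid unconditionally after the scan. */
static void
reload_unconditional(ToyRel *rel)
{
	rel->rd_isvalid = true;
	rel->row_version = toy_scan(rel);
	rel->rd_isvalid = true;		/* hides an inval that arrived mid-scan */
}

/* One possible alternative: rescan until the flag survives the scan. */
static void
reload_checked(ToyRel *rel)
{
	do
	{
		rel->rd_isvalid = true;
		rel->row_version = toy_scan(rel);
	} while (!rel->rd_isvalid);
}
```

With pending_invals = 1, reload_unconditional leaves the entry marked
valid while holding the old row version, whereas reload_checked rescans
once and ends up holding the current one.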