Stefan Kaltenbrunner <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> Vacuum's always had a race condition: it makes a list of rel OIDs and >> then tries to vacuum each one. It narrows the window for failure by >> doing a SearchSysCacheExists test before relation_open, but there's >> still a window for failure.
> hmm yeah - missed the VACUUM; part of the regression diff. > Still this means we will have to live with (rare) failures once in a > while during that test ? I thought of what seems a pretty simple solution for this: make VACUUM lock the relation before doing the SearchSysCacheExists, ie instead of the existing code if (!SearchSysCacheExists(RELOID, ObjectIdGetDatum(relid), 0, 0, 0)) // give up lmode = vacstmt->full ? AccessExclusiveLock : ShareUpdateExclusiveLock; onerel = relation_open(relid, lmode); do lmode = vacstmt->full ? AccessExclusiveLock : ShareUpdateExclusiveLock; LockRelationOid(relid, lmode); if (!SearchSysCacheExists(RELOID, ObjectIdGetDatum(relid), 0, 0, 0)) // give up onerel = relation_open(relid, NoLock); Once we're holding lock, we can be sure there's not a DROP TABLE in progress, so there's no race condition anymore. It's OK to take a lock on the OID of a relation that no longer exists, AFAICS; we'll just drop it again immediately (the "give up" path includes transaction exit, so there's not even any extra code needed). This wasn't possible before the recent adjustments to the relation locking protocol, but now it looks trivial ... am I missing anything? Perhaps it is worth folding this test into a "conditional_relation_open" function that returns NULL instead of failing if the rel no longer exists. I think there are potential uses in CLUSTER and perhaps REINDEX as well as VACUUM. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly