In commit 4240e429d0c2d889d0cda23c618f94e12c13ade7, we modified RangeVarGetRelid() so that it acquires a lock on the target relation "atomically" with respect to the name lookup. Since we lock OIDs, not names, that's not possible, strictly speaking, but the idea is that we detect whether any invalidation messages were processed between the time we did the lookup and the time we acquired the relation lock. If so, we double-check that the name still refers to the same object, and if not, we release the lock on the object we had locked previously and instead lock the new one. This is a good thing, because it means that if you, for example, start a transaction, drop a table, create a new one in its place, and commit the transaction, people who were blocked waiting for the table lock will correctly latch onto the new table instead of erroring out all over the place. This is obviously advantageous in production situations where switching out tables in this way has historically been quite difficult to do without user-visible disruption.
The trouble with this mechanism, however, is that sometimes atomically looking up the relation and locking it is not what you want to do. For example, you might want to have a permission check in the middle, so that you don't even briefly lock a table on which the user has no permissions. Or, in the case of DROP INDEX, you might find it necessary to lock the heap before the index, as a result of our general rule that the heap locks should be acquired before index locks to reduce deadlocks. Right now, there's no good way to do this. Some code just opens the target relation with whatever lock is needed from the get-go, and we just hope the transaction will abort before it causes anyone else too much trouble. Other code looks up the relation OID without getting a lock, does its checks, and then gets the lock - but all of the places that take this approach uniformly lack the sort of retry loop that is present in RangeVarGetRelid() - and are therefore prone to latching onto a dropped relation, or maybe even checking permissions on the relation with OID 123 and then dropping the (eponymous) relation with OID 456. Although none of this seems like a huge problem in practice, it's awfully ugly, and it seems like it would be nice to clean it up. The trouble is, I'm not quite sure how to do that. It seems like permissions checks and lock-the-heap-for-this-index should be done in RangeVarGetRelid() just after the block that says "if (retry)" and just before the block that calls LockRelationOid(). That way, if we end up deciding we need to retry the name lookup, we'll retry all that other stuff as well, which is exactly right. The difficulty is that different callers have different needs for what should go in that space, to the degree that I'm a bit nervous about continuing to add arguments to that function to satisfy what everybody needs. Maybe we could make most of them Booleans and pass an "int flags" argument. Another option would be to add a "callback" argument to that function that would be called at that point with the relation, relId, and oldRelId as arguments. Alternatively, we could just resign ourselves to duplicating the loop in this function into each place in the code that has a special-purpose requirement, but the function is complex enough to make that a pretty unappealing prospect. Or we could just punt the whole thing and do nothing, but I don't like that either. Right now, for example, if user A is doing something with a table, and user B (who has no permissions) tries to use it, the pending request for an AccessExclusiveLock will block user C (who does have permissions) from doing anything until A's transaction commits or rolls back; only at that point do we realize that B has been naughty. I'd like to figure out some way to fix that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers