On Fri, Jun 17, 2011 at 11:41 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Robert Haas <robertmh...@gmail.com> writes: >> On Thu, Jun 16, 2011 at 6:54 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: >>> 4. Backend #2 visits the new, about-to-be-committed version of >>> pgbench_accounts' pg_class row just before backend #3 commits. >>> It sees the row as not good and keeps scanning. By the time it >>> reaches the previous version of the row, however, backend #3 >>> *has* committed. So that version isn't good according to SnapshotNow >>> either. > >> <thinks some more> > >> Why isn't this a danger for every pg_class update? For example, it >> would seem that if VACUUM updates relpages/reltuples, it would be >> prone to this same hazard. > > VACUUM does that with an in-place, nontransactional update. But yes, > this is a risk for every transactional catalog update.
Well, after various efforts to fix the problem, I notice that there are two distinct problems brought out by your test case. One of them is caused by my patch, one of them was already there in the code - this latter one is actually the hardest to fix. It took me about an hour to fix the first bug, but its taken a while of thinking about the second before I realised it was a pre-existing bug. The core problem is, as you observed that a pg_class update can cause rows to be lost with concurrent scans. We scan pg_class in two ways: to rebuild a relcache entry based on a relation's oid (easy fix). We also scan pg_class to resolve the name to oid mapping. The name to oid mapping is performed *without* a lock on the relation, since we don't know which relation to lock. So the name lookup can fail if we are in the middle of a pg_class update. This is an existing potential bug in Postgres unrelated to my patch. Ref: SearchCatCache() I've been looking at ways to lock the relation name and namespace prior to the lookup (or more precisely the hash), but its worth discussing whether we want that at all? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers