On Sun, Jul 7, 2013 at 9:24 AM, Simon Riggs <si...@2ndquadrant.com> wrote: > On 3 January 2012 18:42, Tom Lane <t...@sss.pgh.pa.us> wrote: >> I wrote: >>>> Another point that requires some thought is that switching SnapshotNow >>>> to be MVCC-based will presumably result in a noticeable increase in each >>>> backend's rate of wanting to acquire snapshots. >> >> BTW, I wonder if this couldn't be ameliorated by establishing some >> ground rules about how up-to-date a snapshot really needs to be. >> Arguably, it should be okay for successive SnapshotNow scans to use the >> same snapshot as long as we have not acquired a new lock in between. >> If not, reusing an old snap doesn't introduce any race condition that >> wasn't there already. > > Now that has been implemented using the above design, we can resubmit > the lock level reduction patch, with thanks to Robert. > > Submitted patch passes original complaint/benchmark. > > Changes > * various forms of ALTER TABLE, notably ADD constraint and VALIDATE > * CREATE TRIGGER > > One minor coirrections to earlier thinking with respect to toast > tables. That might be later relaxed. > > Full tests including proof of lock level reductions, plus docs.
I'm quite glad to see this patch back again for another round. I haven't reviewed it in detail yet, so the purpose of this email is just to lay out some general areas of concern and food for thought rather than to critique anything in the patch too specifically. Generally, the question on the table is: to what extent do MVCC catalog scans make the world safe for concurrent DDL, or put negatively, what hazards remain? Noah discovered an interesting one recently: apparently, the relcache machinery has some logic in it that depends on the use of AccessExclusiveLock in subtle ways. I'm going to attempt to explain it, and maybe he can jump in and explain it better. Essentially, the problem is that when a relcache reload occurs, certain data structures (like the tuple descriptor, but there are others) are compared against the old version of the same data structure. If there are no changes, we do nothing; else, we free the old one and install the new one. The reason why we don't free the old one and install the new one unconditionally is because other parts of the backend might have pointers to the old data structure, so just replacing it all the time would result in crashes. Since DDL requires AccessExclusiveLock + CheckTableNotInUse(), any actual change to the data structure will happen when we haven't got any pointers anyway. A second thing I'm concerned about is that, although MVCC catalog scans guarantee that we won't miss a concurrently-updated row entirely, or see a duplicate, they don't guarantee that the rows we see during a scan of catalog A will be consistent with the rows we see during a scan of catalog B moments later. For example, hilarity will ensue if relnatts doesn't match what we see in pg_attribute. Now I don't think this particular example matters very much because I think there are probably lots of other things that would also break if we try to add a column without a full table lock, so we're probably doomed there anyway. But there might be other instances of this problem that are more subtle. I'll try to find some time to look at this in more detail. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers