On Fri, Jun 17, 2011 at 2:44 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Hmm, yeah, I think this idea is probably better than mine, just because
> of the less dubious semantics. I don't see how you'd make it work for
> generic MVCC scans, because the behavior will be "the database state as
> of some hard-to-predict time after the scan starts", which is not what
> we want for MVCC. But it ought to be fine for SnapshotNow.
Department of second thoughts: I think I see a problem.

Suppose we have a tuple that has not been updated for a long time. Its XMIN is committed and all-visible, and its XMAX is invalid. As we're scanning the table (transaction T1), we see that tuple and say, oh, it's visible. Now, another transaction (T2) begins, updates the tuple, and commits. Our scan then reaches the page where the new tuple is located, and says, oh, this is recent, I'd better take a real snapshot. Naturally, the new snapshot can see the new version of the tuple, too. If T1 had taken its snapshot before starting the scan, the second tuple would have been invisible. But since we didn't take it until later, after T2 had already committed, we see a duplicate: both the old and the new version of the same row.

That's still no worse than your idea, which rests on the theory that duplicates don't matter anyway, but the case for it being better is a lot thinner. I'd sure prefer something that had less crazy semantics than either of these ideas, if we can think of something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
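
For anyone who wants to poke at the T1/T2 interleaving described above, here is a toy simulation of it. This is only a sketch under loud assumptions: it is not PostgreSQL code, a snapshot is modeled as nothing more than the set of committed xids, and the names HeapTuple, ALL_VISIBLE_HORIZON, scan and run are all made up for illustration. It compares a scan that defers taking its snapshot until it meets a recent-looking tuple against one that takes the snapshot up front.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Set


@dataclass
class HeapTuple:
    key: str
    xmin: int
    xmax: Optional[int] = None  # None models an invalid XMAX


# Hypothetical cutoff: xids below this are treated as committed and all-visible.
ALL_VISIBLE_HORIZON = 100


def visible(tup: HeapTuple, snapshot: FrozenSet[int]) -> bool:
    """Toy MVCC check: inserter is in the snapshot, deleter (if any) is not."""
    inserted = tup.xmin in snapshot
    deleted = tup.xmax is not None and tup.xmax in snapshot
    return inserted and not deleted


def scan(heap: List[HeapTuple], committed: Set[int], lazy: bool,
         after_tuple: Optional[Callable[[int], None]] = None) -> List[str]:
    """Scan the heap in physical order and return the keys judged visible.

    lazy=True defers the snapshot until the first tuple that does not look
    old; lazy=False takes it before reading anything.
    """
    snapshot = None if lazy else frozenset(committed)
    seen: List[str] = []
    i = 0
    while i < len(heap):  # re-check the length: the heap may grow mid-scan
        tup = heap[i]
        looks_old = tup.xmin < ALL_VISIBLE_HORIZON and tup.xmax is None
        if snapshot is None and looks_old:
            seen.append(tup.key)  # "oh, it's visible", no snapshot needed
        else:
            if snapshot is None:
                snapshot = frozenset(committed)  # "better take a real snapshot"
            if visible(tup, snapshot):
                seen.append(tup.key)
        if after_tuple is not None:
            after_tuple(i)
        i += 1
    return seen


def run(lazy: bool) -> List[str]:
    """T1 scans one old row while T2 updates it and commits mid-scan."""
    heap = [HeapTuple("row", xmin=10)]
    committed = {10}

    def t2_update(index: int) -> None:
        if index == 0:  # fires once, right after T1 has read the old version
            heap[0].xmax = 200                       # T2 marks the old version dead
            heap.append(HeapTuple("row", xmin=200))  # new version on a later page
            committed.add(200)                       # T2 commits

    return scan(heap, committed, lazy, t2_update)


print("deferred snapshot:", run(lazy=True))
print("snapshot up front:", run(lazy=False))
```

In this model the deferred-snapshot scan returns the row twice (once as the old version it accepted without a snapshot, once as the new version seen by the late snapshot), while the scan that takes its snapshot first returns it once.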