Robert Haas <robertmh...@gmail.com> writes:
> On Fri, Jun 17, 2011 at 1:22 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Yeah.  After mulling it for awhile, what about this idea: we could
>> redefine SnapshotNow as a snapshot type that includes a list of
>> transactions-in-progress, somewhat like an MVCC snapshot, but we don't
>> fill that list from the PGPROC array.  Instead, while running a scan
>> with SnapshotNow, anytime we determine that a particular XID is
>> still-in-progress, we add that XID to the snapshot's list.
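
[Illustration, not part of the original message.]  The quoted idea can be
modelled in a few lines.  This is only a toy sketch in Python, not
PostgreSQL code: the class name, the dictionary standing in for
transaction status, and the simplified visibility rule (inserter
committed, deleter not committed) are all assumptions made for the
example.  It shows one interleaving only, an update that commits in the
middle of a scan, and makes no claim about other interleavings.

```python
IN_PROGRESS, COMMITTED, ABORTED = "in_progress", "committed", "aborted"


class AccumulatingSnapshotNow:
    """Toy model: a SnapshotNow-like snapshot that remembers every XID it
    has once observed as in progress.  All names here are hypothetical."""

    def __init__(self, status):
        # Shared, mutable mapping xid -> state; it may change mid-scan.
        self.status = status
        # Starts empty: the list is NOT filled from a process array.
        self.seen_in_progress = set()

    def xid_committed(self, xid):
        # Once seen in progress, keep treating it so for the rest of the scan.
        if xid in self.seen_in_progress:
            return False
        state = self.status[xid]
        if state == IN_PROGRESS:
            self.seen_in_progress.add(xid)
            return False
        return state == COMMITTED

    def tuple_visible(self, xmin, xmax=None):
        # Simplified rule: inserter committed and deleter (if any) not committed.
        if not self.xid_committed(xmin):
            return False
        return xmax is None or not self.xid_committed(xmax)
```

In this model, if XID 10 is updating a row while the scan passes the old
row version, the old version is visible and 10 is remembered.  If 10 then
commits before the scan reaches the new version, the new version is
still treated as invisible, so this particular scan returns one version
rather than two.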

> I think that something like that might possibly work, but what if the
> XID array overflows?

Well, you repalloc it bigger.  In either this idea or yours below, I
fear SnapshotNow snaps will have to become dynamically-allocated
structures instead of being simple references to a shared constant
object.  (This is because we can sometimes do a SnapshotNow scan when
another one is already in progress, and we couldn't let the inner one
change the outer one's state.)  That's not really a performance problem;
one more palloc to do a catalog scan is a non-issue.  But it is likely
to be a large notational change compared to what we've got now.

> A while back I proposed the idea of a "lazy" snapshot, by which I had
> in mind something similar to what you are suggesting but different in
> detail.  Initially, when asked to acquire a snapshot, the snapshot
> manager acknowledges having taken one but does not actually do any
> work.  As long as it sees only XIDs that either precede the oldest XID
> still running anywhere in the cluster, or have aborted, it can provide
> answers that are 100% correct without any further data.  If it ever
> sees a newer, non-aborted XID then it goes and really gets an MVCC
> snapshot at that point, which it can use from that point onward.  I
> think that it might be possible to make such a system work even for
> MVCC snapshots generally, but even if not, it might be sufficient for
> this purpose.  Unlike your approach, it would avoid both the "see no
> rows" and the "see multiple rows" cases, which might be thought an
> advantage.

Hmm, yeah, I think this idea is probably better than mine, just because
of the less dubious semantics.  I don't see how you'd make it work for
generic MVCC scans, because the behavior will be "the database state as
of some hard-to-predict time after the scan starts", which is not what
we want for MVCC.  But it ought to be fine for SnapshotNow.
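
[Illustration, not part of the original message.]  A rough way to
picture the "lazy" snapshot described in the quoted text is the toy
model below.  It is a sketch under stated assumptions, not PostgreSQL
code: ToyCluster, MvccSnapshot, and LazySnapshot are hypothetical names,
XIDs are plain increasing integers (no wraparound), and each lazy
snapshot is its own object, so a nested scan cannot disturb an outer
one.

```python
from dataclasses import dataclass, field

IN_PROGRESS, COMMITTED, ABORTED = "in_progress", "committed", "aborted"


@dataclass
class ToyCluster:
    """Hypothetical stand-in for shared transaction state."""
    status: dict = field(default_factory=dict)  # xid -> state
    next_xid: int = 1

    def begin(self):
        xid = self.next_xid
        self.next_xid += 1
        self.status[xid] = IN_PROGRESS
        return xid

    def finish(self, xid, outcome):
        self.status[xid] = outcome

    def running(self):
        return {x for x, s in self.status.items() if s == IN_PROGRESS}

    def oldest_running(self):
        # With nothing running, every assigned XID precedes next_xid.
        return min(self.running(), default=self.next_xid)


@dataclass(frozen=True)
class MvccSnapshot:
    xmax: int        # first XID not yet assigned when the snapshot was taken
    xip: frozenset   # XIDs in progress when the snapshot was taken

    def sees(self, xid, cluster):
        if xid >= self.xmax or xid in self.xip:
            return False
        return cluster.status[xid] == COMMITTED


class LazySnapshot:
    """Does no work up front; takes a real snapshot only when forced to."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.real = None  # filled in on first "newer, non-aborted" XID

    def sees(self, xid):
        if self.real is None:
            if self.cluster.status[xid] == ABORTED:
                return False  # aborted: answer is exact without a snapshot
            if xid < self.cluster.oldest_running():
                return True   # finished and not aborted, so committed
            # Newer, non-aborted XID: really take the snapshot now.
            self.real = MvccSnapshot(self.cluster.next_xid,
                                     frozenset(self.cluster.running()))
        return self.real.sees(xid, self.cluster)
```

In this model the snapshot's effective time is whenever the first
newer, non-aborted XID happens to be encountered, which is the
"hard-to-predict time after the scan starts" behavior discussed above.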

			regards, tom lane