Andres Freund <[email protected]> wrote:

> > Attached here is what I consider a possible fix - simply wait for the CLOG
> > update before building a new snapshot.
>
> I don't think that's enough - during non-timetravel visibility semantics, you
> can only look at the clog if the transaction isn't marked as in-progress in
> the procarray. ISTM that we need to do that here too?

I understand that CLOG must be up-to-date by the time the snapshot is used for
visibility checks, but I think that - from the snapshot user's POV - what
matters is "snapshot->xip vs CLOG" rather than "procarray vs CLOG".

For procarray-based snapshots, this consistency is ensured by 1) not removing
the XID from procarray until the status is set in CLOG and 2) getting the list
of running transactions from procarray. Thus if an MVCC snapshot does not have
a particular XID in its "xip" array, it implies that the XID is no longer in
procarray and therefore its status has already been set in CLOG.

As for logical decoding based snapshots (whether HISTORIC_MVCC or those
eventually converted to regular MVCC), we currently do not check whether CLOG
is consistent with the transaction list in snapshot->xip. What I proposed is
that we enforce this consistency by checking CLOG (and possibly waiting) before
we finalize the snapshot. Thus the snapshot user can safely assume that the
snapshot->xip array is consistent with CLOG, as if the snapshot was based on
procarray.

Or is there another issue with the CLOG itself? I thought about wraparound
(i.e. getting the XID status from a CLOG slot which is still being used by old
transactions) but I wouldn't expect that (AFAICS, CLOG truncation takes place
during XID freezing). Concurrent access to the slot should not be a problem
either, since only a single byte (which is atomic) needs to be fetched during
the XID status check.

Another hypothetical problem that occurs to me is memory access ordering,
i.e. one backend creates and exports the snapshot and another one imports it
before it can see the CLOG update. It's hard to imagine though.

Or are there other concerns?

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com
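
P.S. To make the "snapshot->xip vs CLOG" invariant concrete, here is a minimal
toy sketch in C. It is not PostgreSQL code: every type and function name
(ToySnapshot, toy_clog, toy_snapshot_consistent_with_clog, ...) is hypothetical,
the "CLOG" is a plain array, XID wraparound is ignored, and xip is assumed to
have the regular MVCC meaning (XIDs in [xmin, xmax) that are still running).
The check only shows the rule: every XID that the snapshot treats as finished
must already have a final status recorded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's. */
typedef uint32_t ToyXid;

typedef enum
{
    TOY_IN_PROGRESS = 0,    /* no final status recorded yet */
    TOY_COMMITTED,
    TOY_ABORTED
} ToyXidStatus;

#define TOY_CLOG_SIZE 64

/* Stand-in for CLOG; zero-initialized, so every XID starts "in progress". */
static ToyXidStatus toy_clog[TOY_CLOG_SIZE];

typedef struct
{
    ToyXid        xmin;   /* every XID < xmin is finished */
    ToyXid        xmax;   /* every XID >= xmax is treated as running */
    const ToyXid *xip;    /* XIDs in [xmin, xmax) treated as running */
    size_t        xcnt;   /* number of entries in xip */
} ToySnapshot;

/* Record the final status of an XID in the toy CLOG. */
static void
toy_clog_set(ToyXid xid, ToyXidStatus status)
{
    assert(xid < TOY_CLOG_SIZE);
    toy_clog[xid] = status;
}

/* Linear search; good enough for a sketch. */
static bool
toy_xid_in_xip(const ToySnapshot *snap, ToyXid xid)
{
    for (size_t i = 0; i < snap->xcnt; i++)
    {
        if (snap->xip[i] == xid)
            return true;
    }
    return false;
}

/*
 * Return true if every XID in [xmin, xmax) that the snapshot treats as
 * finished (i.e. not listed in xip) has a final status in the toy CLOG.
 * Otherwise store the first offending XID in *pending (if not NULL) and
 * return false; a caller would wait for that XID and retry before
 * finalizing the snapshot.
 */
static bool
toy_snapshot_consistent_with_clog(const ToySnapshot *snap, ToyXid *pending)
{
    assert(snap->xmax <= TOY_CLOG_SIZE);

    for (ToyXid xid = snap->xmin; xid < snap->xmax; xid++)
    {
        if (toy_xid_in_xip(snap, xid))
            continue;           /* snapshot already treats it as running */

        if (toy_clog[xid] == TOY_IN_PROGRESS)
        {
            if (pending != NULL)
                *pending = xid;
            return false;
        }
    }
    return true;
}
```

In this model, a snapshot builder would call
toy_snapshot_consistent_with_clog() in a loop, waiting for the reported XID
each time, and hand out the snapshot only once the function returns true. It
says nothing about the memory ordering question raised above.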
