On Mon, May 1, 2017 at 2:10 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> Actually, I guess amcheck would need to use its own scan's snapshot
> xmin instead. This is true because it cares about visibility in a way
> that's "backwards" relative to existing code that tests something
> against RecentGlobalXmin. Is there any existing thing that works that
> way?

Looks like pg_visibility has a similar set of concerns, and so
sometimes calls GetOldestXmin() to "recompute" what it calls
OldestXmin (which I gather is like RecentGlobalXmin, but comes from
calling GetOldestXmin() at least once). This happens within
pg_visibility's collect_corrupt_items(). So, I could either follow
that approach, or, more conservatively, call GetOldestXmin()
immediately after each "amcheck whole index scan" finishes, for use
later on, when we go to the heap. Within the heap, we expect that any
committed tuple whose xmin precedes FooIndex.OldestXmin should be
present in that index's bloom filter. Of course, when there are
multiple indexes, we might only arrive at the heap much later. (I
guess we'd also want to check if the MVCC snapshot's xmin preceded
FooIndex.OldestXmin, and use the snapshot's xmin as
FooIndex.OldestXmin when that happened to be the case.)

Anyone have an opinion on any of this? Offhand, I think that calling
GetOldestXmin() once per index when its "amcheck whole index scan"
finishes would be safe, and yet provide appreciably better test
coverage than only expecting things visible to our original MVCC
snapshot to be present in the index. I don't see a great reason to be
more aggressive and call GetOldestXmin() more often than once per
whole index scan, though.

Peter Geoghegan

VMware vCenter Server

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)