On Fri, May 27, 2016 at 10:58 AM, Kevin Grittner <kgri...@gmail.com> wrote:
>> As far as I can see normal index builds will allow concurrent hot
>> prunes and everything, since those only require page-level
>> exclusive locks.
>>
>> So for !concurrent builds we might end up with a corrupt index.
>
> ... by which you mean an index that omits certainly-dead heap
> tuples which have been the subject of early pruning or vacuum, even
> if there are still registered snapshots that include those tuples?
> Or do you see something else?

I think that is the danger.

> Again, since both the heap pages involved and all the new index
> pages would have LSNs recent enough to trigger the old snapshot
> check, I'm not sure where the problem is, but will review closely
> to see what I might have missed.

This is a good point, though, I think.

>> I think many of the places relying on heap scans with !mvcc
>> snapshots are problematic in that way.  Outdated reads will not be
>> detected by TestForOldSnapshot() (given the (snapshot)->satisfies
>> == HeapTupleSatisfiesMVCC condition therein), and rely on the
>> snapshot to be actually working.  E.g. CLUSTER / VACUUM FULL rely
>> on accurate HEAPTUPLE_RECENTLY_DEAD
>
> Don't the "RECENTLY" values imply that the transaction which caused
> the tuple to be dead is still running?  Since tuples are not
> subject to early pruning or vacuum when that is the case, where do
> you see a problem?  The snapshot itself has the usual xmin et al.,
> so I'm not sure what you might mean by "the snapshot to be actually
> working" if not the override at the time of pruning/vacuuming.

Anybody who calls HeapTupleSatisfiesVacuum() with an xmin value that
is newer than the oldest registered snapshot in the system (based on
some snapshots being ignored) might get a return value of
HEAPTUPLE_DEAD rather than HEAPTUPLE_RECENTLY_DEAD.  It seems
necessary to carefully audit all calls to HeapTupleSatisfiesVacuum()
to see whether that difference matters.
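To make the hazard concrete, here is a minimal sketch of the calling
pattern in question.  It is not actual PostgreSQL source -
classify_tuple_example() is an invented caller - but
HeapTupleSatisfiesVacuum(), RecentGlobalXmin, and
TransactionIdLimitedForOldSnapshots() are the real symbols under
discussion:

    #include "postgres.h"
    #include "utils/rel.h"
    #include "utils/snapmgr.h"
    #include "utils/tqual.h"

    /* Invented example caller, not PostgreSQL source code. */
    static HTSV_Result
    classify_tuple_example(Relation rel, HeapTuple tuple, Buffer buf)
    {
        /* Conservative horizon: honors every registered snapshot. */
        TransactionId oldest_xmin = RecentGlobalXmin;

        /*
         * With old_snapshot_threshold >= 0, this can move the horizon
         * forward, past the xmin of a still-registered but "old"
         * snapshot...
         */
        oldest_xmin = TransactionIdLimitedForOldSnapshots(oldest_xmin, rel);

        /*
         * ...in which case a tuple that such a snapshot can still see
         * comes back HEAPTUPLE_DEAD here rather than
         * HEAPTUPLE_RECENTLY_DEAD.  Whether that matters depends on
         * what the caller does with the result.
         */
        return HeapTupleSatisfiesVacuum(tuple, oldest_xmin, buf);
    }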
I took a quick look and here's what I see:

statapprox_heap(): Statistical information for the DBA.  The
difference is non-critical.

heap_prune_chain(): Seeing the tuple as dead might cause it to be
removed early.  This should be OK.  Removing the tuple early will
cause the page LSN to be bumped unless RelationNeedsWAL() returns
false, and TransactionIdLimitedForOldSnapshots() includes that as a
condition for disabling early pruning.

IndexBuildHeapRangeScan(): We might end up with indexIt = false
instead of indexIt = true.  That should be OK, because anyone using
the old snapshot will see a new page LSN and error out.  We might
also fail to set indexInfo->ii_BrokenHotChain = true.  I suspect
that's a problem, but I'm not certain of it.

acquire_sample_rows(): Both return values are treated in the same
way.  No problem.

copy_heap_data(): We'll end up setting isdead = true instead of
doing tups_recently_dead += 1.  That means the new heap won't
include the tuple, which is OK because old snapshots can't read the
new heap without erroring out, assuming the new heap has LSNs.  The
xmin used here comes from vacuum_set_xid_limits(), which goes
through TransactionIdLimitedForOldSnapshots(), so this should be OK
for the same reasons as heap_prune_chain().  Another effect of
seeing the tuple as prematurely dead is that we'll call
rewrite_heap_dead_tuple() on it; rewrite_heap_dead_tuple() will
presume that if this tuple is dead, its predecessor in the ctid
chain is also dead.  I don't see any obvious problem with that.

lazy_scan_heap(): Basically, the same thing as heap_prune_chain().

CheckForSerializableConflictOut(): Maybe a problem?  If the tuple is
dead, there's no issue, but if it's recently dead, there might be.

We might want to add comments to some of these places addressing
snapshot_too_old specifically.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company