On Tue, May 31, 2016 at 10:03 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Fri, May 27, 2016 at 10:58 AM, Kevin Grittner <kgri...@gmail.com> wrote: >>> As far as I can see normal index builds will allow concurrent hot >>> prunes and everything; since those only require page-level >>> exclusive locks. >>> >>> So for !concurrent builds we might end up with a corrupt index. >> >> ... by which you mean an index that omits certainly-dead heap >> tuples which have been the subject of early pruning or vacuum, even >> if there are still registered snapshots that include those tuples? >> Or do you see something else? > > I think that is the danger.
Well, the *perceived* danger -- since every page in the new index would be new enough to be recognized as too recently modified to be safe for a snapshot that could still see the omitted rows, the only question I'm sorting out on this is how safe it might be to cause the index to be ignored in planning when using such a snapshot. That and demonstrating the safe behavior to those not following closely enough to see what will happen without a demonstration. >> Again, since both the heap pages involved and all the new index >> pages would have LSNs recent enough to trigger the old snapshot >> check, I'm not sure where the problem is, > This is a good point, though, I think. The whole perception of risk in this area seems to be based on getting that wrong; although the review of this area may yield some benefit in terms of minimizing false positives. >>> I think many of the places relying on heap scans with !mvcc >>> snapshots are problematic in that way. Outdated reads will not be >>> detected by TestForOldSnapshot() (given the (snapshot)->satisfies >>> == HeapTupleSatisfiesMVCC condition therein), and rely on the >>> snapshot to be actually working. E.g. CLUSTER/ VACUUM FULL rely >>> on accurate HEAPTUPLE_RECENTLY_DEAD >> >> Don't the "RECENTLY" values imply that the transaction is still >> running which cause the tuple to be dead? Since tuples are not >> subject to early pruning or vacuum when that is the case, where do >> you see a problem? The snapshot itself has the usual xmin et al., >> so I'm not sure what you might mean by "the snapshot to be actually >> working" if not the override at the time of pruning/vacuuming. > > Anybody who calls HeapTupleSatisfiesVacuum() with an xmin value that > is newer that the oldest registered snapshot in the system (based on > some snapshots being ignored) might get a return value of > HEAPTUPLE_DEAD rather than HEAPTUPLE_RECENTLY_DEAD. Since the override xmin cannot advance past the earliest transaction which is still active, HEAPTUPLE_DEAD indicates that the transaction causing the tuple to be dead has completed and the tuple is irrevocably dead -- even if there are still snapshots registered which can see it. Seeing HEAPTUPLE_DEAD rather than HEAPTUPLE_RECENTLY_DEAD is strictly limited to tuples which are certainly and permanently dead and for which the only possible references are non-MVCC snapshots or existing snapshots subject to "snapshot too old" monitoring. > IndexBuildHeapRangeScan(): We might end up with indexIt = false > instead of indexIt = true. That should be OK because anyone using the > old snapshot will see a new page LSN and error out. We might also > fail to set indexInfo->ii_BrokenHotChain = true. I suspect that's a > problem, but I'm not certain of it. The latter flag is what I'm currently digging at; but my hope is that whenever old_snapshot_threshold >= 0 we can set indexInfo->ii_BrokenHotChain = true to cause the planner to skip consideration of the index if the snapshot is an "old" one. That will avoid some false positives (seeing the error when not strictly necessary). If that works out the way I hope, the only down side is that a scan using a snapshot from an old transaction or cursor would use some other index or a heap scan; but we already have that possibility in some cases -- that seems to be the point of the flag. > CheckForSerializableConflictOut: Maybe a problem? If the tuple is > dead, there's no issue, but if it's recently-dead, there might be. If the tuple is not visible to the scan, the behavior is unchanged (a simple return from the function on either HEAPTUPLE_DEAD or HEAPTUPLE_RECENTLY_DEAD with !visible) and (thus) clearly correct. If the tuple is visible to us it is currently subject to early pruning or vacuum (since those operations would get the same modified xmin) but has not yet had any such treatment since we made it to this function in the first place. The processing for SSI purposes would be unaffected by the possibility that there could later be early pruning/vacuuming. Thanks for the review and feedback! -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers