On Thu, Nov 2, 2017 at 4:20 AM, Andres Freund <and...@anarazel.de> wrote:
> I think the problem is on the pruning, rather than the freezing side. We
> can't freeze a tuple if it has an alive predecessor - rather than
> weakening this, we should be fixing the pruning to not have the alive
> predecessor.

Excellent catch.

> If the update xmin is actually below the cutoff we can remove the tuple
> even if there's live lockers - the lockers will also be present in the
> newer version of the tuple. I verified that for me that fixes the
> problem. Obviously that'd require some comment work and more careful
> diagnosis.

I didn't even know that that was safe.

> I think a5736bf754c82d8b86674e199e232096c679201d might be dangerous in
> the face of previously corrupted tuple chains and pg_upgraded clusters -
> it can lead to tuples being considered related, even though they're
> from entirely independent hot chains. Especially when upgrading 9.3 post
> your fix, to current releases.

Frankly, I'm relieved that you got to this. I was highly suspicious of
a5736bf754c82d8b86674e199e232096c679201d, even beyond my specific,
actionable concern about how it failed to handle the
9.3/FrozenTransactionId xmin case as special. As I went into in the
"heap/SLRU verification, relfrozenxid cut-off, and freeze-the-dead
bug" thread, these commits left us with a situation where there didn't
seem to be a reliable way of knowing whether or not it is safe to
interrogate clog for a given heap tuple using a tool like amcheck.
And, it wasn't obvious that you couldn't have a codepath that failed
to account for pre-cutoff non-frozen tuples -- codepaths that call
TransactionIdDidCommit() despite it actually being unsafe.

If I'm not mistaken, your proposed fix restores sanity there.

--
Peter Geoghegan

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
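For anyone following along, here is a rough toy model of the pruning rule quoted above, as I read it: a tuple whose update committed with an XID older than the cutoff can be treated as removable whether or not it still has live lockers. This is only a sketch and not the actual PostgreSQL code. The names `Xid`, `ToyTuple`, `xid_precedes` and `toy_tuple_is_prunable` are hypothetical, and the wraparound-aware comparison is my assumption about how two normal XIDs would be ordered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a transaction ID; not the real PostgreSQL type. */
typedef uint32_t Xid;

/* In this model, values below 3 are reserved for special XIDs. */
#define FIRST_NORMAL_XID ((Xid) 3)

/*
 * Wraparound-aware "a is older than b" for two normal XIDs, using
 * modulo-2^32 arithmetic (an assumption of this sketch).
 */
static bool
xid_precedes(Xid a, Xid b)
{
    assert(a >= FIRST_NORMAL_XID && b >= FIRST_NORMAL_XID);
    return (int32_t) (a - b) < 0;
}

/* Hypothetical, heavily simplified view of an updated heap tuple. */
typedef struct ToyTuple
{
    Xid  update_xid;        /* XID of the transaction that updated the tuple */
    bool update_committed;  /* did that update commit? */
    bool has_live_lockers;  /* are there still-running lockers on the tuple? */
} ToyTuple;

/*
 * The rule from the quoted text: if the update committed and its XID is
 * below the cutoff, the old version may be removed. has_live_lockers is
 * deliberately not consulted, on the stated reasoning that the lockers
 * are also present on the newer version of the tuple.
 */
static bool
toy_tuple_is_prunable(const ToyTuple *tup, Xid cutoff)
{
    if (!tup->update_committed)
        return false;
    return xid_precedes(tup->update_xid, cutoff);
}
```

With this model, a tuple updated by committed XID 100 is prunable against a cutoff of 200 even when `has_live_lockers` is true, and is not prunable against a cutoff of 50.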