ISTM that heap_compute_xid_horizon_for_tuples() calculates latestRemovedXid for index deletion callers without sufficient care. The function only follows line pointer redirects, which is necessary but not sufficient to visit all relevant heap tuple headers -- it also needs to traverse HOT chains, but that doesn't happen. AFAICT heap_compute_xid_horizon_for_tuples() might therefore fail to produce a sufficiently recent latestRemovedXid value for the index deletion operation as a whole. This might in turn lead to the REDO routine (e.g. btree_xlog_delete()) doing conflict processing incorrectly during hot standby.
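To make the difference concrete, here is a toy model in plain C. It is only a sketch under simplifying assumptions, not the actual heapam code: ToyItem, advance(), horizon_redirect_only(), horizon_with_hot_chain() and toy_demo_page() are all made-up names, XIDs are compared with a plain ">" (the real code uses TransactionIdFollows() to cope with wraparound), and the xmin/xmax validation that a real HOT chain walk performs is left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Xid;               /* toy stand-in for TransactionId */
#define INVALID_XID 0
#define MAX_ITEMS   8               /* toy page size, purely illustrative */

typedef enum { ITEM_UNUSED, ITEM_NORMAL, ITEM_REDIRECT, ITEM_DEAD } ItemKind;

/* Hypothetical, heavily simplified line pointer plus tuple header. */
typedef struct {
    ItemKind kind;
    int      redirect_to;           /* ITEM_REDIRECT: offset of the target */
    Xid      xmax;                  /* ITEM_NORMAL: deleting/updating xid */
    int      next;                  /* ITEM_NORMAL: next HOT chain member, or -1 */
} ToyItem;

/* Advance the horizon if this tuple's xmax is newer (ignores wraparound). */
void advance(Xid *latest, Xid xmax)
{
    if (xmax != INVALID_XID && xmax > *latest)
        *latest = xmax;
}

/* Models the behavior described above: follow a redirect, look at one tuple. */
Xid horizon_redirect_only(const ToyItem *page, int off)
{
    Xid latest = INVALID_XID;

    assert(off >= 0 && off < MAX_ITEMS);
    if (page[off].kind == ITEM_REDIRECT)
        off = page[off].redirect_to;
    assert(off >= 0 && off < MAX_ITEMS);
    if (page[off].kind == ITEM_NORMAL)
        advance(&latest, page[off].xmax);
    return latest;
}

/* Models the missing step: keep walking the HOT chain after the redirect. */
Xid horizon_with_hot_chain(const ToyItem *page, int off)
{
    Xid latest = INVALID_XID;
    int steps = 0;

    assert(off >= 0 && off < MAX_ITEMS);
    if (page[off].kind == ITEM_REDIRECT)
        off = page[off].redirect_to;
    /* The step limit guards against a corrupt (cyclic) chain in the toy model. */
    while (off >= 0 && off < MAX_ITEMS && steps++ < MAX_ITEMS &&
           page[off].kind == ITEM_NORMAL)
    {
        advance(&latest, page[off].xmax);
        off = page[off].next;
    }
    return latest;
}

/* Item 1 redirects to item 2; item 2 was HOT-updated to item 3. */
void toy_demo_page(ToyItem page[MAX_ITEMS])
{
    for (int i = 0; i < MAX_ITEMS; i++)
        page[i] = (ToyItem){ ITEM_UNUSED, -1, INVALID_XID, -1 };
    page[1] = (ToyItem){ ITEM_REDIRECT, 2, INVALID_XID, -1 };
    page[2] = (ToyItem){ ITEM_NORMAL, -1, 100, 3 };
    page[3] = (ToyItem){ ITEM_NORMAL, -1, 105, -1 };
}
```

With the page built by toy_demo_page(), starting from item 1 the redirect-only variant stops at the first chain member and reports 100, while the chain-walking variant also sees the second member and reports 105. The XID values are arbitrary.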
Attached is an instrumentation patch. If I run "make check" with the patch applied, I get test output failures that can be used to get a general sense of the problem:

$ cat /code/postgresql/patch/build/src/test/regress/regression.diffs | grep "works okay this time" | wc -l
382

$ cat /code/postgresql/patch/build/src/test/regress/regression.diffs | grep "hot chain bug"
+WARNING: hot chain bug, latestRemovedXid: 2307, latestRemovedXidWithHotChain: 2316
+WARNING: hot chain bug, latestRemovedXid: 4468, latestRemovedXidWithHotChain: 4538
+WARNING: hot chain bug, latestRemovedXid: 4756, latestRemovedXidWithHotChain: 4809
+WARNING: hot chain bug, latestRemovedXid: 5000, latestRemovedXidWithHotChain: 5001
+WARNING: hot chain bug, latestRemovedXid: 7683, latestRemovedXidWithHotChain: 7995
+WARNING: hot chain bug, latestRemovedXid: 13450, latestRemovedXidWithHotChain: 13453
+WARNING: hot chain bug, latestRemovedXid: 10040, latestRemovedXidWithHotChain: 10041

So out of 389 calls, we see 7 failures on this occasion, which is typical. Heap pruning usually saves us in practice (since it is highly correlated with setting LP_DEAD bits on index pages in the first place), and even when it doesn't it's not particularly likely that the issue will make the crucial difference for the deletion operation as a whole.

The code that is now heap_compute_xid_horizon_for_tuples() ran in REDO routines directly prior to Postgres 12. heap_compute_xid_horizon_for_tuples() is a descendant of code added by Simon’s commit a760893d in 2010 -- pretty close to HOT’s initial introduction. So this has been around for a long time.

--
Peter Geoghegan
0001-Instrument-heap_compute_xid_horizon_for_tuples.patch