Hi,

On 2023-02-04 11:10:55 -0800, Peter Geoghegan wrote:
> On Sat, Feb 4, 2023 at 2:57 AM Andres Freund <and...@anarazel.de> wrote:
> > Is there a good way to make breakage in the page recycling mechanism
> > visible with gist? I guess to see corruption, I'd have to halt a scan
> > before a page is visited with gdb, then cause the page to be recycled
> > prematurely in another session, then unblock the first? Which'd then
> > visit that page, thinking it to be in a different part of the tree than
> > it actually is?
>
> Yes. This bug is similar to an ancient nbtree bug fixed back in 2012,
> by commit d3abbbeb.
>
> > which clearly doesn't seem right.
> >
> > I just can't quite judge how bad that is.
>
> It's really hard to judge, even if you're an expert. We're talking
> about a fairly chaotic scenario. My guess is that there is a very
> small chance of a very unpleasant scenario if you have a GiST index
> that has regular page deletions, and if you use
> vacuum_defer_cleanup_age. It's likely that most GiST indexes never
> have any page deletions due to the workload characteristics.

Thanks. Sounds like a problem here is too hard to repro. I mostly wanted to
know how to be more confident about a fix working correctly. There are no
tests for the whole page recycling behaviour, afaics, so it's a bit scary to
change things around.

I didn't quite feel confident pushing a fix for this just before a minor
release, so I'll push once the minor releases are tagged. That'll be a quite
minimal fix to GetFullRecentGlobalXmin() in 12-13 (returning
FirstNormalTransactionId if epoch == 0 and RecentGlobalXmin > nextxid_xid),
and the slightly larger fix in 14+.

Greetings,

Andres Freund
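
PS: To make the 12-13 idea concrete, here is a rough standalone sketch of the
clamp described above. It is not the actual patch. The types are simplified
stand-ins for the PostgreSQL ones, and the function name, its parameters and
the helper full_xid_from_epoch_and_xid() are hypothetical, picked only for
illustration. The assumption is that RecentGlobalXmin can end up numerically
larger than the next xid while still in epoch 0 (e.g. after
vacuum_defer_cleanup_age has been subtracted), in which case there is no
earlier epoch to attribute it to.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL types (assumption, not the real headers). */
typedef uint32_t TransactionId;
typedef struct { uint64_t value; } FullTransactionId;

#define FirstNormalTransactionId ((TransactionId) 3)

/* Hypothetical helper: pack a 32-bit epoch and a 32-bit xid into 64 bits. */
static FullTransactionId
full_xid_from_epoch_and_xid(uint32_t epoch, TransactionId xid)
{
    FullTransactionId result;

    result.value = ((uint64_t) epoch << 32) | xid;
    return result;
}

/*
 * Hypothetical sketch: widen recent_global_xmin to 64 bits, given the epoch
 * and xid parts of the next full transaction id.
 */
static FullTransactionId
full_recent_global_xmin_sketch(TransactionId recent_global_xmin,
                               uint32_t nextxid_epoch,
                               TransactionId nextxid_xid)
{
    assert(recent_global_xmin >= FirstNormalTransactionId);

    if (recent_global_xmin > nextxid_xid)
    {
        /*
         * The xmin looks like it belongs to the previous epoch. In epoch 0
         * there is no previous epoch, so clamp instead of letting
         * "epoch - 1" underflow.
         */
        if (nextxid_epoch == 0)
            return full_xid_from_epoch_and_xid(0, FirstNormalTransactionId);

        return full_xid_from_epoch_and_xid(nextxid_epoch - 1,
                                           recent_global_xmin);
    }

    /* Common case: xmin is in the same epoch as the next xid. */
    return full_xid_from_epoch_and_xid(nextxid_epoch, recent_global_xmin);
}
```

The clamp only changes behaviour in epoch 0. In later epochs the xmin is still
attributed to the previous epoch, as before.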