Matthias van de Meent <boekewurm+postg...@gmail.com> writes:
> The only two existing mechanisms that I could find (in the access/heap
> directory) that possibly could fail on shrunken line pointer arrays;
> being xlog recovery (I do not have enough knowledge on recovery to
> determine if that may touch pages that have shrunken line pointer
> arrays, or if those situations won't exist due to never using dirtied
> pages in recovery) and backwards table scans on non-MVCC snapshots
> (which would be fixed in the attached patch).

I think you're not visualizing the problem properly.

The case I was concerned about back when is that there are various bits
of code that may visit a page with a predetermined TID in mind to look
at. An index lookup is an obvious example, and another one is chasing
an update chain's t_ctid link. You might argue that if the tuple was
dead enough to be removed, there should be no such in-flight references
to worry about, but I think such an assumption is unsafe. There is not,
for example, any interlock that ensures that a process that has obtained
a TID from an index will have completed its heap visit before a VACUUM
that subsequently removed that index entry also removes the heap tuple.

So, to accept a patch that shortens the line pointer array, what we need
to do is verify that every such code path checks for an out-of-range
offset before trying to fetch the target line pointer. I believed back
in 2007 that there were, or once had been, code paths that omitted such
a range check, assuming that they could trust the TID they had gotten
from $wherever to point at an extant line pointer array entry. Maybe
things are all good now, but I think you should run around and examine
every place that checks for tuple deadness to see if the offset it used
is known to be within the current page bounds.

			regards, tom lane
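
As an illustration of the range check described above, here is a minimal
sketch in C. It deliberately uses simplified, hypothetical stand-ins
(SketchPage, SketchLinePointer, sketch_fetch_item,
sketch_truncate_unused_tail) rather than PostgreSQL's real page
structures and macros, so it shows only the shape of the check, not the
actual heap code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT PostgreSQL's real definitions. */
#define SKETCH_MAX_ITEMS 8

typedef enum { SLOT_UNUSED, SLOT_NORMAL, SLOT_DEAD } SketchSlotState;

typedef struct {
    SketchSlotState state;
    int             payload;   /* stands in for the tuple the slot points at */
} SketchLinePointer;

typedef struct {
    uint16_t          nitems;  /* current array length; may shrink over time */
    SketchLinePointer items[SKETCH_MAX_ITEMS];
} SketchPage;

/*
 * Look up a slot by a 1-based offset obtained earlier (say, from an index
 * entry or an update-chain link). The offset is validated against the
 * page's *current* bounds before the array is touched, because the array
 * may have been shortened since the caller obtained the offset.
 * Returns NULL if the offset is out of range or the slot is not live.
 */
static const SketchLinePointer *
sketch_fetch_item(const SketchPage *page, uint16_t offnum)
{
    if (offnum < 1 || offnum > page->nitems)
        return NULL;                    /* stale or invalid offset */

    const SketchLinePointer *lp = &page->items[offnum - 1];

    if (lp->state != SLOT_NORMAL)
        return NULL;                    /* slot exists but holds no live tuple */
    return lp;
}

/* Drop trailing unused slots, shortening the array as the patch proposes. */
static void
sketch_truncate_unused_tail(SketchPage *page)
{
    assert(page != NULL);
    while (page->nitems > 0 &&
           page->items[page->nitems - 1].state == SLOT_UNUSED)
        page->nitems--;
}
```

A caller holding an offset that predates the truncation gets NULL back
instead of reading past the end of the shortened array. A code path that
skipped the first test in sketch_fetch_item would be the kind of place
that needs auditing.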