Hi Alexander, On Mon, Jun 9, 2025 at 2:00 AM Alexander Lakhin <exclus...@gmail.com> wrote: > Please look at the following script, which falsifies an assertion
> TRAP: failed Assert("!BTScanPosIsPinned(so->currPos)"), File: "nbtutils.c", > Line: 3379, PID: 1621028 I can reproduce this. I think that it's an old bug. The problem is related to mark/restore (used by merge joins), where nbtree can sometimes independently hold a second pin on leaf pages (albeit very briefly, inside _bt_steppage). The cause of the failure of my recently-added assertion is that it assumes that so->dropPin always implies !BTScanPosIsPinned(so->currPos) at the start of _bt_killitems. I don't think that that assumption is unreasonable, even in light of this assertion problem. I think that the real problem is the way that _bt_killitems releases resources. _bt_killitems doesn't try to leave so->currPos as it was at the start of its call -- this is nothing new. The overall effect is that during so->dropPin _bt_steppage calls, we *generally* don't call IncrBufferRefCount in the "bump pin on current buffer for assignment to mark buffer" path, but *will* call IncrBufferRefCount when we happened to need to call _bt_killitems. Calling _bt_killitems shouldn't have these side-effects. I can make the assertion failure go away by teaching _bt_killitems to leave so->currPos in the same state that it was in when _bt_killitems was called. If there was no pin on the page when _bt_killitems was called, then there should be no pin held when it returns. See the attached patch. As I said, I think that this is actually an old bug. After all, _bt_killitems is required to test so->currPos.lsn against the same page's current LSN if the page pin was dropped since _bt_readpage was first called -- regardless of any other details. I don't think that it'll consistently do that in this hard-to-test mark/restore path on any Postgres version, so there might be a risk of failing to detect an unsafe concurrent TID recycling hazard. I've always thought that the whole idiom of testing BTScanPosIsPinned() in a bunch of places was very brittle. I wonder if it makes sense to totally replace those calls/tests with similar tests of the new so->dropPin field. -- Peter Geoghegan
v1-0001-Fix-issue-with-mark-restore-nbtree-pins.patch
Description: Binary data