While working on skip scan, I stumbled upon a bug on HEAD. This is an issue in my commit 5bf748b8, "Enhance nbtree ScalarArrayOp execution". The attached test case (repro_wrong_prim.sql) causes an assertion failure on HEAD. Here's the stack trace:
TRAP: failed Assert("so->keyData[opsktrig].sk_strategy != BTEqualStrategyNumber"), File: "../source/src/backend/access/nbtree/nbtutils.c", Line: 2475, PID: 1765589 [0x55942a24db8f] _bt_advance_array_keys: /mnt/nvme/postgresql/patch/build_meson_dc/../source/src/backend/access/nbtree/nbtutils.c:2475 [0x55942a24bf22] _bt_checkkeys: /mnt/nvme/postgresql/patch/build_meson_dc/../source/src/backend/access/nbtree/nbtutils.c:3797 [0x55942a244160] _bt_readpage: /mnt/nvme/postgresql/patch/build_meson_dc/../source/src/backend/access/nbtree/nbtsearch.c:2221 [0x55942a2434ca] _bt_first: /mnt/nvme/postgresql/patch/build_meson_dc/../source/src/backend/access/nbtree/nbtsearch.c:1888 [0x55942a23ef88] btgettuple: /mnt/nvme/postgresql/patch/build_meson_dc/../source/src/backend/access/nbtree/nbtree.c:259 The problem is that _bt_advance_array_keys() doesn't take sufficient care at the point where it decides whether its call to _bt_check_compare against finaltup (with the scan direction flipped around) indicates that another primitive index scan is required. The final decision is conditioned on rules about how the scan key offset sktrig that triggered the call to _bt_advance_array_keys() relates to the scan key offset that was set by the _bt_check_compare finaltup comparison. This was fragile. It breaks with this test case because of fairly subtle conditions around when and how the arrays advance, the layout of the relevant leaf page, and the placement of inequality scan keys. When assertions are disabled, we do multiple primitive index scans that land on the same leaf page, which isn't supposed to be possible anymore. The query gives correct answers, but this behavior is definitely wrong (it is simply supposed to be impossible now, per 5bf748b8's commit message). Attached is a draft bug fix patch. It nails down the test by simply testing "so->keyData[opsktrig].sk_strategy != BTEqualStrategyNumber" directly, rather than comparing scan key offsets. This is a far simpler and far more direct approach. You might wonder why I didn't do it like this in the first place. It just worked out that way. The code in question was written before I changed the design of _bt_check_compare (in the draft patch that became commit 5bf748b8). Up until not that long before the patch was committed, _bt_check_compare would set "continuescan=false" for non-required arrays. That factor made detecting whether or not the relevant _bt_check_compare call had in fact encountered a required inequality of the kind we need to detect (to decide on whether to start another primitive index scan) difficult and messy. However, the final committed patch simplified _bt_check_compare, making the approach I've taken in the bug fix patch possible. I just never made the connection before now. -- Peter Geoghegan
v1-0001-Fix-nbtree-array-unsatisfied-inequality-check.patch
Description: Binary data
repro_wrong_prim.sql
Description: Binary data