On Tue, Dec 5, 2023 at 4:53 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > Hmm ... I had not paid any attention to this commit, but the rationale > given in the commit message is just flat wrong: > > Imagine the ordered B-tree scan for the query like this. > > SELECT * FROM tbl WHERE col > 'a' AND col < 'b' ORDER BY col; > > The (col > 'a') scan key will be always matched once we find the location > to > start the scan. The (col < 'b') scan key will match every item on the > page > as long as it matches the last item on the page. > > That argument probably holds for the index's first column, but it is > completely and obviously wrong for every following column. Nonetheless > it looks like we're trying to apply the optimization to every scan key.
Just to be clear, you're raising a concern that seems to me to apply to "the other optimization" from the same commit, specifically -- the precheck optimization. Not the one I found a problem in. (They're closely related but distinct optimizations.) I *think* that that part is handled correctly, because non-required scan keys are not affected (by either optimization). I have no specific reason to doubt the proposition that 'b' could only be marked required in situations where it is indeed safe to assume that the col < 'b' condition must also apply to earlier tuples transitively (i.e. this must be true because it was true for the the final tuple on the page during the _bt_readpage precheck). That being said, I wouldn't rule out problems for the precheck optimization in the presence of opfamilies like the one from my test case. I didn't get as far as exploring that side of things, at least. -- Peter Geoghegan