Hi all,

I learned that the TID Scan node does not properly implement EvalPlanQual 
recheck, which has already been noted in comment added in 2002 (commit 
6799a6ca21). This does seem to have the potential to manifest in observable and 
surprising ways when queries are filtering on `ctid`, like for optimistic 
concurrency control as in https://stackoverflow.com/q/78487441.

Attached is an attempt from me at fixing this issue for TID Scan as well as TID 
Range Scan which appears to have the same issue.

This is my first time looking at the Postgres source (nor do I work with C on a 
regular basis), so I will not be surprised to hear that I've done things wrong, 
but I'm hopeful that this is usable as written. Running `meson test` passes for 
me, including the added isolation tests, and I confirmed that both of the tests 
that now result in a negative recheck were previously failing.

In my implementation I am calling TidRangeEval every time TidRangeRecheck is 
called; let me know if this is a performance concern. It didn't seem that any 
of the existing TidRangeScanState flags are quite right to know if the bounds 
have been initialized, but I am happy to add one.

The original issue appears to have been present since the introduction of TID 
scans in 1999 (commit 6f9ff92cc0) so I imagine it may make sense to backport 
the fix, although evidently not many people are running into this.

Thanks,
Sophie

Attachment: 0001-Fix-missing-EvalPlanQual-recheck-for-TID-scans.v1.patch
Description: Binary data

Reply via email to