I've recently been investigating improving our plans for queries like: SELECT * FROM t WHERE t.foo IN (1, 2..1000); where the table "t" has a partial index on "foo" where "foo IS NOT NULL".
Currently the planner generates an index [only] scan so long as the number of items in the IN expression is <= 100, but as soon as you add the 101st item it reverts to seq scan. If we add the explicit null check like: SELECT * FROM t WHERE t.foo IN (1, 2..1000) AND foo IS NOT NULL; then we go back to the desired index scan. This occurs because predtest.c stops expanding ScalarArrayOpExpr's with constant array arguments into OR trees when the array size is > 100. The rest of the predicate proving code then becomes unable to infer that foo is not null and therefore the planner cannot prove that the partial index is correct to use. (Please pardon technical details in the below background that may be way off; I don't have a lot of experience with the Postgres codebase yet, and am still trying to build a mental model of things.) At first I was imagining having the parse keep track of whether an array const expr contained any nulls and perhaps adding generated quals (in an equivalence class?) to allow the planner to easily prove the index was correct. I'd been going down this track because in my mind the issue was because the planner needed to verify whether all of the array elements were not null. But as I started to dig into the predtest.c NOT NULL proofs and add test cases, I realized that at least in many normal op cases we can safely infer that foo is not null when "foo <op> <array>" is true even if the array contains null elements. This is such a simple change that it seems like I must be missing a case where the above doesn't hold true, but I can't immediately think of any, and indeed with the attached patch all existing tests pass (including some additional ones I added for predtest to play around with it). Am I missing something obvious? Is this a valid approach? Other outstanding questions: Should I add additional tests for predtest? It already seems to cover some null test cases with scalar array ops, but I'd be happy to add more if desired. Should I add a test case for the resulting plan with "foo IN (...)" with an array with more than 100 elements? Thanks, James Coleman
saop_is_not_null-v1.patch
Description: Binary data