Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's

James Coleman Mon, 14 Jan 2019 08:25:40 -0800

On Mon, Jan 14, 2019 at 11:08 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> James Coleman <jtc...@gmail.com> writes:
> > On Sun, Jan 13, 2019 at 3:06 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> >> There's still a logical problem here, which is that in order to
> >> prove that the ScalarArrayOpExpr yields NULL when its LHS does,
> >> you also have to prove that the RHS is not an empty array.
>
> > I've constructed a test case running the test_predtest function on the
> > expression
> > "select x is null, x = any(array[]::int[]) from integers" which I
> > think gets at the logic bug you're noting. In doing this I noticed
> > something I don't fully understand: on master this shows that "x =
> > any(array[]::int[])" proves "x is null" (while my patch shows the
> > opposite). Is this because the second expression being false means
> > there's can be no valid value for x? It also claims it refutes the
> > same, which is even more confusing.
>
> That sounds like a bug, but I think it's actually OK because a vacuous
> OR is necessarily false, and a false predicate "implies" any conclusion
> at all according to the customary definition of implication.  If there
> are any real optimization bugs in this area, it's probably a result of
> calling code believing more than it should about the meaning of a proof.
>
> I think these test cases don't actually prove much about the behavior
> of your patch.  Wouldn't they be handled by the generic OR-conversion
> logic, since there's fewer than MAX_SAOP_ARRAY_SIZE items?


Which ones are you looking at in particular? The "inline" (non-cte)
happy and sad path cases have 102 total array elements (as does the
happy path cte case), and MAX_SAOP_ARRAY_SIZE is 100. The other two
tests are about the empty array case so much have 0 items (and were
the ones that showed different behavior between v3 and v4).

> > It seems to me that this would mean that an IS NOT NULL index would
> > still not be proven to be usable for a non-constant array (e.g., from
> > a subquery calculated array), and in fact I've included a (failing)
> > test demonstrating that fact in the attached patch. This feeds into my
> > question above about there possibly being another corollary/proof that
> > could be added for this case.
>
> Hm.  That would be annoying, but on reflection I think maybe we don't
> actually need to worry about emptiness of the array after all.  The
> thing that we need to prove, at least for the implication case, is
> "truth of the ScalarArrayOp implies truth of the IS NOT NULL clause".
> Per above, if the ScalarArrayOp is necessarily false, then we can
> claim that the implication holds.  However, we need to work through
> all four proof rules (strong vs. weak, implication vs refutation)
> and see which ones this applies to --- I'm not sure offhand that
> the logic works for all four.  (ENOCAFFEINE...)  In any case it
> requires thinking a bit differently about what clause_is_strict_for()
> is really doing.

Are you thinking that implies clause_is_strict_for might not be the
right/only place? Or just that the code in that function needs to
consider if it should return different results in some cases to handle
all 4 proof rules properly?

> > Semi-related: I noticed that predicate_classify handles an ArrayExpr
> > by using list_length; this seems unnecessary if we know we can fail
> > fast. I realize it's not a common problem, but I have seen extremely
> > long arrays before, and maybe we should fall out of the count once we
> > hit max+1 items.
>
> Huh?  list_length() is constant-time.

Facepalm. I'd somehow had it stuck in my head that we actually
iterated the list to count.

James Coleman

Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's

Reply via email to