Re: Convert NOT IN sublinks to anti-joins when safe

Richard Guo Wed, 04 Feb 2026 01:48:05 -0800

On Tue, Feb 3, 2026 at 4:12 PM Richard Guo <[email protected]> wrote:
> This topic has been discussed several times in the past.  Due to the
> semantic mismatch regarding NULL handling, NOT IN is not ordinarily
> equivalent to an anti-join.  However, if we can prove that neither the
> outer expressions nor the subquery outputs can yield NULL values, it
> should be safe to convert NOT IN to an anti-join.


I've noticed a loose end in the v1 patch.

The semantic gap between NOT IN and anti-join actually exists whenever
the operator returns NULL.  For NOT IN, if (A op B) returns NULL, then
NOT (NULL) evaluates to NULL (effectively false), and the row is
discarded.  In contrast, for an anti-join, if (A op B) returns NULL,
it implies no match was found, and the anti-join logic dictates that
the row should be kept.

To guarantee that (A op B) never returns NULL, the current patch
verifies that both A and B are non-nullable.  However, this is not
sufficient.  The "op" might be an operator that returns NULL on
non-null inputs.

On the other hand, if "op" does not return NULL on NULL inputs, like
IS DISTINCT FROM, we technically would not even need to require that A
and B are non-nullable.

Is there a convenient way to verify that an operator never returns
NULL on non-null inputs?  Would it be sufficient to insist that the
operator belongs to btree opclass (assuming that the strict ordering
requirements of btree imply this safety)?

And, is it worth checking if an operator never returns NULL even on
NULL inputs?  If we can identify such operators, we should be able to
remove the requirement that both sides of NOT IN must be non-nullable.
Is there a convenient way to check for such operators?

- Richard

Re: Convert NOT IN sublinks to anti-joins when safe

Reply via email to