Hi Richard
      Thanks for working on this.  I think the overall idea makes sense: if
a
a strict qual inside the RHS subtree proves that a RHS Var is non-null for
every row emitted by that subtree, then using that proof to recognize the
LEFT JOIN ... IS NULL case as an anti join seems valid .

The propagation rules also look conservative: inner/semi joins can carry
proofs from both sides plus their own quals, while outer joins only carry
proofs from the non-nullable side.

Maybe I'm overthinking this, but one implementation detail worth considering
is whether pass1 should store the derived proof directly, rather than
carrying
the original safe qual lists around.  `safe_quals` seems to be used later
only
to call `find_nonnullable_vars()`, so the state could perhaps keep something
like:

```
List *safe_nonnullable_vars;

then, when a safe quail is found:

result->safe_nonnullable_vars =
    mbms_add_members(result->safe_nonnullable_vars,
                     find_nonnullable_vars(f->quals));
And in the anti-join check:

nonnullable_vars =
    mbms_add_members(NIL, right_state->safe_nonnullable_vars);
nonnullable_vars =
    mbms_add_members(nonnullable_vars,
                     find_nonnullable_vars(j->quals));
That would keep the pass1 state closer to what pass2 actually needs, and
avoid
carrying/concatenating implicit-AND qual lists only to rescan them later.
One
small detail is that mbms_add_members() mutates its first argument, so when
propagating a child state through an outer join we'd want to copy it, e.g.
with
mbms_add_members(NIL, child_state->safe_nonnullable_vars), rather than
assign
the child list directly.
This is not a correctness objection to the current patch, just a possible
way
to make the stored state match its only consumer more closely.


Thanks

On Fri, Jun 5, 2026 at 3:50 PM Richard Guo <[email protected]> wrote:

> On Fri, Jun 5, 2026 at 12:00 PM Isaac Morland <[email protected]>
> wrote:
> > On Thu, 4 Jun 2026 at 22:55, Richard Guo <[email protected]> wrote:
> >> reduce_outer_joins() already recognizes that a LEFT JOIN is really an
> >> anti-join when an upper qual forces a nullable-side Var to be NULL
> >> while that Var is actually non-null in every matching row.  In that
> >> case only the null-extended (unmatched) rows can satisfy the upper
> >> qual, which is exactly anti-join semantics, so we switch JOIN_LEFT to
> >> JOIN_ANTI.  This is worth detecting because an anti-join is usually
> >> much cheaper than computing the outer join and then filtering the
> >> result with the IS NULL clause.
>
> > While you’re at it, any chance of changing it so that "a LEFT JOIN b
> WHERE b IS NULL" is guaranteed not to have worse performance than " a LEFT
> JOIN b WHERE b.f IS NULL"? I've had this and asked about it here and it was
> suggested that I should specify the primary key fields of b; but surely if
> I want an antijoin it's clearer and better implicit documentation if I
> don't zero in on specific fields. I think somebody else is suggesting "a
> LEFT ANTI JOIN b" which would be even better.
>
> The reason the planner doesn't recognize "WHERE b IS NULL" during
> outer-join reduction is that find_forced_null_var() skips NullTest
> nodes with argisrow=true, which is what the parser produces for a
> whole-row IS NULL test.  So the anti-join reduction never sees it.
>
> To handle it we'd need to decompose the whole-row test into per-column
> checks, and that expansion needs the relation's column list, which
> find_forced_null_var() doesn't have.  One option would be to let
> find_forced_null_var() recognize the whole-row IS NULL and hand back
> the whole-row Var, then have reduce_outer_joins() expand it into the
> relation's columns before running the non-null proof.
>
> That said, I'm not sure it's worth the churn.  IMHO, the whole-row IS
> NULL test seems less common than the column form.  If you want an
> anti-join, it seems more natural to just write NOT EXISTS.  For the
> same reason, I'm not convinced a "LEFT ANTI JOIN" would buy us much
> over NOT EXISTS.
>
> - Richard
>
>
>

Reply via email to