On Sun, May 31, 2026 at 8:28 PM Tender Wang <[email protected]> wrote:
> Radim Marek <[email protected]> 于2026年5月29日周五 23:55写道:
> > == How to reproduce
> >
> >   CREATE TEMP TABLE c(id int, country text);
> >   CREATE TEMP TABLE o(customer_id int);
> >   INSERT INTO c VALUES (1,'US'),(2,'US'),(3,'DE'),(4,'DE'),(5,'DE');
> >   INSERT INTO o VALUES (1),(3);   -- only customers 1 and 3 have a row in o
> >
> >   SELECT c.country, count(*) AS n
> >   FROM c
> >   WHERE NOT EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id)
> >   GROUP BY c.country
> >   ORDER BY c.country;
> >
> > Expected results (everywhere except master)
> >
> >  country | n
> > ---------+---
> >  DE      | 2
> >  US      | 1
> > (2 rows)
> >
> > The actual result with enable_eager_aggregate = on (default)
> >
> >  country | n
> > ---------+---
> >  DE      | 0
> >  US      | 0
> > (2 rows)

Thanks for the report.  This is a bug.  We should never push a partial
aggregation down to a relation on the inner (RHS) side of a semi/anti
join.  A semi/anti join does not preserve its inner rows in the join
output, so a partial aggregate computed on the inner side would not
survive the join and could not be combined by the final aggregation.

> I haven't thought about it too deeply yet. Maybe we can do something
> in the make_grouped_join_rel().
> ...
> if (sjinfo->jointype == JOIN_ANTI || sjinfo->jointype == JOIN_SEMI)
>     return;
> ...

That does fix the reported case, but I think it's too broad: it also
disables pushing a partial aggregate to the outer side of a semi/anti
join, which is valid.  And by the time we reach make_grouped_join_rel
the grouped relation for the inner-side relation has already been
built, so it would just go unused.

So I'd rather fix it in eager_aggregation_possible_for_relation, right
next to the existing outer-join check, by rejecting a relation that
lies on the inner side of a semijoin/antijoin.  See attached.

- Richard

Attachment: v1-0001-Fix-eager-aggregation-for-semi-antijoin-inner-rel.patch
Description: Binary data

Reply via email to