Richard:

Thanks for reviving this patch and for all of your work on it! Eager
aggregation pushdown will be beneficial for my work and I'm hoping to see
it land.


I was playing around with v9 of the patches and was specifically curious
about this previous statement...

>This patch also makes eager aggregation work with outer joins.  With
>outer join, the aggregate cannot be pushed down if any column referenced
>by grouping expressions or aggregate functions is nullable by an outer
>join above the relation to which we want to apply the partiall
>aggregation.  Thanks to Tom's outer-join-aware-Var infrastructure, we
>can easily identify such situations and subsequently refrain from
>pushing down the aggregates.

 ...and this related comment in eager_aggregate.out:

>-- Ensure aggregation cannot be pushed down to the nullable side

While I'm new to this work and its subtleties, I'm wondering if this is too
broad a condition.

I modified the first test query in eager_aggregate.sql to make it a LEFT
JOIN and eager aggregation indeed did not happen, which is expected based
on the comments upthread.

query:
SET enable_eager_aggregate=ON;
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.a, sum(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON
t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;

plan:
                         QUERY PLAN
------------------------------------------------------------
 GroupAggregate
   Output: t1.a, sum(t2.c)
   Group Key: t1.a
   ->  Sort
         Output: t1.a, t2.c
         Sort Key: t1.a
         ->  Hash Right Join
               Output: t1.a, t2.c
               Hash Cond: (t2.b = t1.b)
               ->  Seq Scan on public.eager_agg_t2 t2
                     Output: t2.a, t2.b, t2.c
               ->  Hash
                     Output: t1.a, t1.b
                     ->  Seq Scan on public.eager_agg_t1 t1
                           Output: t1.a, t1.b
(15 rows)

(NOTE: I changed the aggregate from avg(...) to sum(...) for simplicity)

But, it seems that eager aggregation for the query above can be
"replicated" as:

query:

EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.a, sum(t2.c)
FROM eager_agg_t1 t1
LEFT JOIN (
    SELECT b, sum(c) c
    FROM eager_agg_t2 t2p
    GROUP BY b
) t2 ON t1.b = t2.b
GROUP BY t1.a
ORDER BY t1.a;

The output of both the original query and this one match (and the plans
with eager aggregation and the subquery are nearly identical if you restore
the LEFT JOIN to a JOIN). I admittedly may be missing a subtlety, but does
this mean that there are conditions under which eager aggregation can be
pushed down to the nullable side?


-Paul-

On Sat, Jul 6, 2024 at 4:56 PM Richard Guo <guofengli...@gmail.com> wrote:

> On Thu, Jun 13, 2024 at 4:07 PM Richard Guo <guofengli...@gmail.com>
> wrote:
> > I spent some time testing this patchset and found a few more issues.
> > ...
>
> > Hence here is the v8 patchset, with fixes for all the above issues.
>
> I found an 'ORDER/GROUP BY expression not found in targetlist' error
> with this patchset, with the query below:
>
> create table t (a boolean);
>
> set enable_eager_aggregate to on;
>
> explain (costs off)
> select min(1) from t t1 left join t t2 on t1.a group by (not (not
> t1.a)), t1.a order by t1.a;
> ERROR:  ORDER/GROUP BY expression not found in targetlist
>
> This happens because the two grouping items are actually the same and
> standard_qp_callback would remove one of them.  The fully-processed
> groupClause is kept in root->processed_groupClause.  However, when
> collecting grouping expressions in create_grouping_expr_infos, we are
> checking parse->groupClause, which is incorrect.
>
> The fix is straightforward: check root->processed_groupClause instead.
>
> Here is a new rebase with this fix.
>
> Thanks
> Richard
>

Reply via email to