[jira] [Commented] (CALCITE-2202) Aggregate Join Push-down on a Single Side

Julian Hyde (JIRA) Thu, 08 Mar 2018 15:17:22 -0800

    [ 
https://issues.apache.org/jira/browse/CALCITE-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392086#comment-16392086
 ]


Julian Hyde commented on CALCITE-2202:
--------------------------------------

Maybe I've been out of academia and in engineering for too long, but I prefer a 
few well-chosen examples than a formal proof. Some examples stake out the 
corner cases (e.g. empty group by) and other examples are easily generalized 
(e.g. if something applies to MIN it applies to MAX and similar functions, and 
if something that applies to AVG it applies to STDEV and similar functions).

Are you claiming that it is better to push down on only one side? (I could see 
how it would be simpler, therefore better, if you could push down to the left 
first, then push down to the right.) I contend that the desired result is 
{code}select sum(e.s * d.c)
from (select deptno, sum(sal) as s from emp group by deptno) as e
join (select deptno, count(*) as c from dept group by deptno) as d
on e.deptno = d.deptno
group by e.deptno{code} (I mistakenly omitted the "group by deptno" in the two 
inner queries last time) and I wonder whether you could get to that by applying 
your methods.

Most aggregate functions we are familiar with have one argument, but would your 
methods apply to aggregate functions with more than one? Let's consider COUNT 
and COVAR_POP, both of which can take two arguments, and ask whether they can 
be pushed down if they have one argument from each side of a join.

> Aggregate Join Push-down on a Single Side
> -----------------------------------------
>
>                 Key: CALCITE-2202
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2202
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: next
>            Reporter: Zhong Yu
>            Assignee: Julian Hyde
>            Priority: Major
>             Fix For: next
>
>
> While investigating https://issues.apache.org/jira/browse/CALCITE-2195, it's 
> apparent that aggregation can be pushed on on a single side (either side), 
> and leave the other side non-aggregated, regardless of whether grouping 
> columns are unique on the other side. My analysis – 
> [http://zhong-j-yu.github.io/aggregate-join-push-down.pdf] .
> This may be useful when the metadata is insufficient; in any case, we may try 
> to provide all 3 possible transformations (aggregate on left only; right 
> only; both sides) to the cost based optimizer, so that the cheapest one can 
> be chosen based on stats. 
> Does this make any sense, anybody? If it sounds good, I'll implement it and 
> offer a PR. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (CALCITE-2202) Aggregate Join Push-down on a Single Side

Reply via email to