[
https://issues.apache.org/jira/browse/CALCITE-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392289#comment-16392289
]
Zhong Yu edited comment on CALCITE-2202 at 3/9/18 2:29 AM:
-----------------------------------------------------------
Everything is moot if I cannot prove my formula. But suppose it is correct:
I do think that COVAR_POP can be pushed down; it can be calculated from
SUM(x*y), SUM(x), SUM(y), COUNT(x,y), all of which can be split through
table union and cross product, and can therefore be pushed down over a join.
Producing more candidate plans may be bad for the CBO; but the extra rule (i.e.
single-sided) can be opt-in in some cases where metadata is missing, and stats
show that group columns are unique or nearly unique.
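A minimal sketch of the decomposition described above, not taken from Calcite itself. It assumes no NULLs in x or y (SQL's COVAR_POP skips rows where either is NULL) and uses hypothetical names (`Partial`, `partial_of`, `cross`). It shows that the four partial aggregates add up across a table union, multiply out across a cross product, and are enough to recover COVAR_POP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Partial aggregates: COUNT, SUM(x), SUM(y), SUM(x*y)."""
    n: int
    sx: float
    sy: float
    sxy: float

    def merge(self, other):
        # Table union: every partial aggregate is simply additive.
        return Partial(self.n + other.n, self.sx + other.sx,
                       self.sy + other.sy, self.sxy + other.sxy)

    def covar_pop(self):
        # COVAR_POP = (SUM(x*y) - SUM(x) * SUM(y) / N) / N
        return (self.sxy - self.sx * self.sy / self.n) / self.n


def partial_of(rows):
    """Compute the partial aggregates of a list of (x, y) rows."""
    rows = list(rows)
    return Partial(len(rows),
                   sum(x for x, _ in rows),
                   sum(y for _, y in rows),
                   sum(x * y for x, y in rows))


def cross(xs, ys):
    """Partial aggregates of the cross product of x-only and y-only rows,
    computed from per-side sums and counts alone."""
    nx, ny = len(xs), len(ys)
    sx, sy = sum(xs), sum(ys)
    return Partial(nx * ny, sx * ny, sy * nx, sx * sy)


def covar_pop_direct(rows):
    """Textbook definition, used only to check the decomposition."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    return sum((x - mx) * (y - my) for x, y in rows) / n
```

Merging `partial_of(a)` with `partial_of(b)` should give the same partials as `partial_of(a + b)`, which is the property a push-down rule would rely on.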
was (Author: zhong.j.yu):
Everything is moot if I cannot prove my formula. But suppose it is correct:
I do think that COVAR_POP can be pushed down; it can be calculated from
SUM(x*y), SUM(x), SUM(y), COUNT(x,y), all of which can be split through
table union and cross product, and can therefore be pushed down over a join.
Producing more candidate plans may be bad for the CBO; but the extra rule (i.e.
single-sided) can be opted into in some cases where metadata is missing, or group
columns are nearly unique.
> Aggregate Join Push-down on a Single Side
> -----------------------------------------
>
> Key: CALCITE-2202
> URL: https://issues.apache.org/jira/browse/CALCITE-2202
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: next
> Reporter: Zhong Yu
> Assignee: Julian Hyde
> Priority: Major
> Fix For: next
>
>
> While investigating https://issues.apache.org/jira/browse/CALCITE-2195, it's
> apparent that aggregation can be pushed down on a single side (either side),
> leaving the other side non-aggregated, regardless of whether grouping
> columns are unique on the other side. My analysis:
> [http://zhong-j-yu.github.io/aggregate-join-push-down.pdf] .
> This may be useful when the metadata is insufficient; in any case, we may try
> to provide all 3 possible transformations (aggregate on left only; right
> only; both sides) to the cost based optimizer, so that the cheapest one can
> be chosen based on stats.
> Does this make any sense, anybody? If it sounds good, I'll implement it and
> offer a PR.
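As an illustration of the single-sided idea in the description above, here is a small sketch for SUM over an inner equi-join. It is one reading of the proposal, not code from Calcite or from the linked analysis; the function names and the `(key, value)` row layout are hypothetical. The left side is pre-aggregated by the join key, the right side is left as-is, and a final aggregate on top still combines the joined rows, so the right side's key does not need to be unique.

```python
from collections import defaultdict


def join_then_aggregate(left, right):
    """SELECT l.k, SUM(l.v) FROM left l JOIN right r ON l.k = r.k GROUP BY l.k"""
    out = defaultdict(int)
    for lk, v in left:
        for rk, _ in right:
            if lk == rk:
                out[lk] += v
    return dict(out)


def aggregate_left_then_join(left, right):
    """Same query with the aggregate pushed below the join on the left only."""
    # Step 1: pre-aggregate the left side by the join key.
    pre = defaultdict(int)
    for k, v in left:
        pre[k] += v
    # Step 2: join the pre-aggregated left side with the raw right side and
    # re-aggregate on top; duplicate right keys repeat the pre-summed value.
    out = defaultdict(int)
    for rk, _ in right:
        if rk in pre:
            out[rk] += pre[rk]
    return dict(out)
```

Both functions should return the same result even when a key appears several times on the right side; the pushed-down form just joins fewer left rows.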