[jira] [Comment Edited] (CALCITE-2202) Aggregate Join Push-down on a Single Side
[ https://issues.apache.org/jira/browse/CALCITE-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392289#comment-16392289 ]

Zhong Yu edited comment on CALCITE-2202 at 3/9/18 2:29 AM:
---

Everything is moot if I cannot prove my formula. But suppose it is correct: I do think that COVAR_POP can be pushed down. It can be calculated from SUM(x*y), SUM(x), SUM(y), COUNT(x,y), all of which can be split through table union and cross product, and can therefore be pushed down over a join.

Producing more candidate plans may be bad for the CBO; but the extra rule (i.e. single-sided) can be opt-in in some cases where metadata is missing and stats show that group columns are unique or nearly unique.

was (Author: zhong.j.yu):
Everything is moot if I cannot prove my formula. But suppose it is correct: I do think that COVAR_POP can be pushed down. It can be calculated from SUM(x*y), SUM(x), SUM(y), COUNT(x,y), all of which can be split through table union and cross product, and can therefore be pushed down over a join.

Producing more candidate plans may be bad for the CBO; but the extra rule (i.e. single-sided) can be opted into in some cases where metadata is missing, or group columns are nearly unique.

> Aggregate Join Push-down on a Single Side
> -----------------------------------------
>
>                 Key: CALCITE-2202
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2202
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>   Affects Versions: next
>            Reporter: Zhong Yu
>            Assignee: Julian Hyde
>            Priority: Major
>             Fix For: next
>
>
> While investigating https://issues.apache.org/jira/browse/CALCITE-2195, it's
> apparent that aggregation can be pushed down on a single side (either side),
> leaving the other side non-aggregated, regardless of whether grouping
> columns are unique on the other side. My analysis:
> [http://zhong-j-yu.github.io/aggregate-join-push-down.pdf] .
> This may be useful when the metadata is insufficient; in any case, we may try
> to provide all 3 possible transformations (aggregate on left only; right
> only; both sides) to the cost-based optimizer, so that the cheapest one can
> be chosen based on stats.
> Does this make any sense, anybody? If it sounds good, I'll implement it and
> offer a PR.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
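
The single-side push-down described in the issue can be illustrated with a small sketch. This is only a toy model, not Calcite code and not the formula from the linked PDF: the tables `A` and `B`, the helpers `join` and `sum_by_key`, and the choice of SUM grouped by the join key are all assumptions made for illustration. The left input is pre-aggregated while the right input is left as-is, even though its key is not unique.

```python
from collections import defaultdict

def join(left, right):
    """Inner equi-join on the first field (the key) of each row."""
    return [(k, v, w) for (k, v) in left for (k2, w) in right if k == k2]

def sum_by_key(rows, value_index):
    """SELECT key, SUM(value) ... GROUP BY key, for rows keyed on field 0."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[0]] += row[value_index]
    return dict(totals)

# Hypothetical inputs: the key is duplicated on BOTH sides.
A = [(1, 10), (1, 20), (2, 5), (3, 7)]
B = [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (4, "e")]

# Original plan: join first, then aggregate.
original = sum_by_key(join(A, B), 1)

# Left-only push-down: aggregate A, join with the non-aggregated B,
# then aggregate again on top to combine the partial sums.
pre_aggregated = list(sum_by_key(A, 1).items())
pushed = sum_by_key(join(pre_aggregated, B), 1)

assert original == pushed
```

In this sketch the join after pre-aggregation sees three left rows instead of four; whether that saving outweighs the extra aggregate is the kind of decision the issue proposes leaving to the cost-based optimizer.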
[jira] [Comment Edited] (CALCITE-2202) Aggregate Join Push-down on a Single Side
[ https://issues.apache.org/jira/browse/CALCITE-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392289#comment-16392289 ]

Zhong Yu edited comment on CALCITE-2202 at 3/9/18 2:27 AM:
---

Everything is moot if I cannot prove my formula. But suppose it is correct: I do think that COVAR_POP can be pushed down. It can be calculated from SUM(x*y), SUM(x), SUM(y), COUNT(x,y), all of which can be split through table union and cross product, and can therefore be pushed down over a join.

Producing more candidate plans may be bad for the CBO; but the extra rule (i.e. single-sided) can be opted into in some cases where metadata is missing, or group columns are nearly unique.

was (Author: zhong.j.yu):
The same text as above; this edit changed only punctuation and spacing.
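
The claim in the comment that COVAR_POP can be rebuilt from SUM(x*y), SUM(x), SUM(y) and COUNT(x,y) can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's proof: the function names are hypothetical, NULL handling is ignored, and a join result is modelled as a union of per-key cross products with x coming from one side and y from the other.

```python
def partials(pairs):
    """Return (SUM(x*y), SUM(x), SUM(y), COUNT) for a list of (x, y) pairs."""
    sxy = sum(x * y for x, y in pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    return (sxy, sx, sy, len(pairs))

def merge_union(p, q):
    """Partials of a UNION ALL of two inputs: add component-wise."""
    return tuple(a + b for a, b in zip(p, q))

def cross_partials(xs, ys):
    """Partials of the cross product xs x ys, from per-side sums and counts only."""
    sx, nx = sum(xs), len(xs)
    sy, ny = sum(ys), len(ys)
    return (sx * sy, sx * ny, sy * nx, nx * ny)

def covar_pop(p):
    """COVAR_POP = (SUM(x*y) - SUM(x) * SUM(y) / n) / n."""
    sxy, sx, sy, n = p
    return (sxy - sx * sy / n) / n

# Hypothetical join result: two join keys, each contributing a cross product.
groups = [([1, 2], [3, 4, 5]), ([10], [20, 30])]
pairs = [(x, y) for xs, ys in groups for x in xs for y in ys]

# Build the partials without materializing the joined pairs.
merged = (0, 0, 0, 0)
for xs, ys in groups:
    merged = merge_union(merged, cross_partials(xs, ys))

# Reference value computed directly from the definition of population covariance.
n = len(pairs)
mean_x = sum(x for x, _ in pairs) / n
mean_y = sum(y for _, y in pairs) / n
direct = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n

assert merged == partials(pairs)
assert abs(covar_pop(merged) - direct) < 1e-9
```

Only per-side sums and counts enter `cross_partials`, which is why these four quantities are candidates for computing below the join.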