[
https://issues.apache.org/jira/browse/CALCITE-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian Hyde resolved CALCITE-1706.
----------------------------------
Resolution: Fixed
Fix Version/s: 1.12.0
Fixed (by which I mean I disabled the rule) in
http://git-wip-us.apache.org/repos/asf/calcite/commit/6b54b6ec.
> DruidAggregateFilterTransposeRule causes very fine-grained aggregations to be
> pushed to Druid
> ---------------------------------------------------------------------------------------------
>
> Key: CALCITE-1706
> URL: https://issues.apache.org/jira/browse/CALCITE-1706
> Project: Calcite
> Issue Type: Bug
> Reporter: Julian Hyde
> Assignee: Julian Hyde
> Fix For: 1.12.0
>
>
> Enabling DruidAggregateFilterTransposeRule may cause very fine-grained
> aggregations to be pushed to Druid.
> Running {{DruidAdapterIT.testFilterTimestamp}}, here is the previous plan
> (with {{DruidAggregateFilterTransposeRule}} disabled):
> {noformat}
> EnumerableInterpreter
> BindableAggregate(group=[{}], C=[COUNT()])
> BindableFilter(condition=[AND(>=(/INT(Reinterpret($0), 86400000),
> 1997-01-01), <(/INT(Reinterpret($0), 86400000), 1998-01-01),
> OR(AND(>=(/INT(Reinterpret($0), 86400000), 1997-04-01),
> <(/INT(Reinterpret($0), 86400000), 1997-05-01)), AND(>=(/INT(Reinterpret($0),
> 86400000), 1997-06-01), <(/INT(Reinterpret($0), 86400000), 1997-07-01))))])
> DruidQuery(table=[[foodmart, foodmart]],
> intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]],
> projects=[[$0]])
> {noformat}
> Here is the (in my opinion inferior) plan with
> {{DruidAggregateFilterTransposeRule}} enabled:
> {noformat}
> EnumerableInterpreter
> BindableAggregate(group=[{}], C=[$SUM0($1)])
> BindableFilter(condition=[AND(=(EXTRACT_DATE(FLAG(YEAR),
> /INT(Reinterpret($0), 86400000)), 1997), OR(=(EXTRACT_DATE(FLAG(MONTH),
> /INT(Reinterpret($0), 86400000)), 4), =(EXTRACT_DATE(FLAG(MONTH),
> /INT(Reinterpret($0), 86400000)), 6)))])
> DruidQuery(table=[[foodmart, foodmart]],
> intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], groups=[{0}],
> aggs=[[COUNT()]])
> {noformat}
> Note that the DruidQuery is aggregating on __timestamp. Given that
> __timestamp has very high cardinality, is this an efficient operation for
> Druid?
> For this particular query, the ideal would be to push the filter into the
> {{intervals}} clause. Then we would not need to group by __timestamp. I am
> not sure why this is not happening.
> [~nishantbangarwa], [~bslim], how bad is the query with
> {{DruidAggregateFilterTransposeRule}} enabled, in your opinion? Is this a
> show-stopper for Calcite 1.12?
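
For illustration only, here is a rough sketch of the query shape the description calls ideal: the year/month filter is expressed as Druid {{intervals}}, so the query needs only a count and no grouping on __timestamp. This is not what Calcite generates. The helpers {{month_interval}} and {{count_query}} are hypothetical names, the native "timeseries" query with "all" granularity is an assumed target shape, and midnight-aligned UTC boundaries are assumed.

```python
import json
from datetime import date


def month_interval(year, month):
    """Return a half-open [start, end) interval for one calendar month,
    formatted like the intervals shown in the DruidQuery plans."""
    start = date(year, month, 1)
    # Roll over to January of the next year after December.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return f"{start.isoformat()}T00:00:00.000/{end.isoformat()}T00:00:00.000"


def count_query(data_source, year, months):
    """Hypothetical helper (not part of Calcite or Druid): build a native
    Druid query that counts rows falling in the given months of one year."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        # "all" granularity returns a single bucket, so nothing is
        # grouped by __timestamp.
        "granularity": "all",
        "intervals": [month_interval(year, m) for m in sorted(months)],
        "aggregations": [{"type": "count", "name": "C"}],
    }


# The filter from testFilterTimestamp: year 1997, month April or June.
query = count_query("foodmart", 1997, [4, 6])
print(json.dumps(query, indent=2))
```

With the filter carried entirely by the intervals, the BindableFilter and the groups=[{0}] aggregation in the second plan would both be unnecessary for this query.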