[ 
https://issues.apache.org/jira/browse/CALCITE-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-1706.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.12.0

Fixed (by which I mean I disabled the rule) in 
http://git-wip-us.apache.org/repos/asf/calcite/commit/6b54b6ec.

> DruidAggregateFilterTransposeRule causes very fine-grained aggregations to be 
> pushed to Druid
> ---------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-1706
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1706
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>             Fix For: 1.12.0
>
>
> Enabling DruidAggregateFilterTransposeRule may cause very fine-grained 
> aggregations to be pushed to Druid.
> Running {{DruidAdapterIT.testFilterTimestamp}}, here is the previous plan 
> (with {{DruidAggregateFilterTransposeRule}} disabled):
> {noformat}
> EnumerableInterpreter
>   BindableAggregate(group=[{}], C=[COUNT()])
>     BindableFilter(condition=[AND(>=(/INT(Reinterpret($0), 86400000), 
> 1997-01-01), <(/INT(Reinterpret($0), 86400000), 1998-01-01), 
> OR(AND(>=(/INT(Reinterpret($0), 86400000), 1997-04-01), 
> <(/INT(Reinterpret($0), 86400000), 1997-05-01)), AND(>=(/INT(Reinterpret($0), 
> 86400000), 1997-06-01), <(/INT(Reinterpret($0), 86400000), 1997-07-01))))])
>       DruidQuery(table=[[foodmart, foodmart]], 
> intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], 
> projects=[[$0]])
> {noformat}
> Here is the (in my opinion inferior) plan with 
> {{DruidAggregateFilterTransposeRule}} enabled:
> {noformat}
> EnumerableInterpreter
>   BindableAggregate(group=[{}], C=[$SUM0($1)])
>     BindableFilter(condition=[AND(=(EXTRACT_DATE(FLAG(YEAR), 
> /INT(Reinterpret($0), 86400000)), 1997), OR(=(EXTRACT_DATE(FLAG(MONTH), 
> /INT(Reinterpret($0), 86400000)), 4), =(EXTRACT_DATE(FLAG(MONTH), 
> /INT(Reinterpret($0), 86400000)), 6)))])
>       DruidQuery(table=[[foodmart, foodmart]], 
> intervals=[[1900-01-09T00:00:00.000/2992-01-10T00:00:00.000]], groups=[{0}], 
> aggs=[[COUNT()]])
> {noformat}
> Note that the DruidQuery is aggregating on __timestamp. Given that 
> __timestamp has very high cardinality, is this an efficient operation for 
> Druid?
> For this particular query, the ideal would be to push the filter into the 
> {{intervals}} clause. Then we would not need to group by __timestamp. I am 
> not sure why this is not happening.
> [~nishantbangarwa], [~bslim], How bad is the query with 
> {{DruidAggregateFilterTransposeRule}} enabled, in your opinion? Is this a 
> show-stopper for Calcite 1.12?
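
To illustrate the "push the filter into the intervals clause" idea from the description: the condition in the test query (year 1997, month 4 or 6) is equivalent to two half-open time ranges, which could in principle be expressed as Druid intervals instead of a filter over a grouped __timestamp. The following is only a minimal sketch of that arithmetic. The class and method names are hypothetical, it is not the Druid adapter's code, and it assumes UTC and plain ISO-8601 strings rather than the exact format shown in the plans above.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Calcite: turns "year = Y AND month IN (...)"
 * into one half-open ISO-8601 interval per month. */
public class IntervalSketch {
  static List<String> monthIntervals(int year, int... months) {
    List<String> intervals = new ArrayList<>();
    for (int month : months) {
      // Inclusive lower bound: first day of the month at midnight UTC.
      LocalDate start = LocalDate.of(year, month, 1);
      // Exclusive upper bound: first day of the following month.
      LocalDate end = start.plusMonths(1);
      intervals.add(start.atStartOfDay().toInstant(ZoneOffset.UTC)
          + "/" + end.atStartOfDay().toInstant(ZoneOffset.UTC));
    }
    return intervals;
  }

  public static void main(String[] args) {
    // The filter from DruidAdapterIT.testFilterTimestamp: 1997, April or June.
    for (String interval : monthIntervals(1997, 4, 6)) {
      System.out.println(interval);
    }
  }
}
```

With intervals like these on the DruidQuery, the remaining aggregate would be a plain COUNT with an empty group key, so grouping by __timestamp would not be needed.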



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
