[ 
https://issues.apache.org/jira/browse/PHOENIX-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213760#comment-15213760
 ] 

Julian Hyde commented on PHOENIX-2794:
--------------------------------------

Calcite does not, but it could, for MIN, MAX, SUM. (Or any aggregate function 
that rolls up using itself.) It would be wrong to do it for AVG or COUNT.

The rule would convert

{noformat}
Aggregate({x}, SUM(c1), MIN(c2), MAX(c3))
  Aggregate({x, y}, SUM(c1), MIN(c2), MAX(c3))
    R
{noformat}

into

{noformat}
Aggregate({x}, SUM(c1), MIN(c2), MAX(c3))
  R
{noformat}

It exploits the fact that you roll up SUM using SUM. AVG, COUNT do not have 
that property. To make your example work, you'd have to make sure that there 
was a project below the lower {{Aggregate}} computing {{TRUNC(ts, 'hour') AS 
y}}. The rule would not work if there was anything between the two 
{{Aggregate}} operators.

> Flatten nested aggregate queries when possible
> ----------------------------------------------
>
>                 Key: PHOENIX-2794
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2794
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>
> The following query:
> {code}
> SELECT TRUNC(ts,'HOUR'), AVG(avg_val)
> FROM (SELECT AVG(val),ts FROM T GROUP BY ts)
> GROUP BY TRUNC(ts,'HOUR');
> {code}
> will run much more efficiently if flattened so that the hourly bucketing is 
> done on the server-side like this:
> {code}
> SELECT TRUNC(ts,'HOUR'), AVG(val)
> FROM T
> GROUP BY TRUNC(ts,'HOUR');
> {code}
> We should flatten when possible. Not sure what the general rule is, but 
> perhaps if the inner and outer aggregate function matches, you can always do 
> this? Maybe only for some aggregate functions like SUM, MIN, MAX, AVG?
> This comes up in time series queries in particular.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to