James Taylor created PHOENIX-2794:
-------------------------------------
Summary: Optimize aggregates of aggregates when possible
Key: PHOENIX-2794
URL: https://issues.apache.org/jira/browse/PHOENIX-2794
Project: Phoenix
Issue Type: Bug
Reporter: James Taylor
The following query:
{code}
SELECT TRUNC(ts,'HOUR'), AVG(avg_val)
FROM (SELECT AVG(val) AS avg_val, ts FROM T GROUP BY ts)
GROUP BY TRUNC(ts,'HOUR');
{code}
will run much more efficiently if flattened so that the hourly bucketing is
done on the server side, like this:
{code}
SELECT TRUNC(ts,'HOUR'), AVG(val)
FROM T
GROUP BY TRUNC(ts,'HOUR');
{code}
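We should flatten when possible. Not sure what the general rule is, but perhaps
if the inner and outer aggregate functions match, you can always do this?
Maybe only for some aggregate functions like SUM, MIN, MAX, AVG? Note that for
AVG the flattened form weights every row equally, so it matches the nested form
only when each ts group has the same number of rows.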
This comes up in time series queries in particular.
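To get a feel for which aggregates might be safe to flatten, here is a rough
sketch in plain Python (not Phoenix code). The rows, the helper names
two_level and flattened, and the bucket labels are all made up for
illustration. It compares the nested form (aggregate per ts, then per hour)
with the flattened form (aggregate per hour directly) on a small data set
where the ts groups have different sizes:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows of (hour_bucket, ts, val); the first ts has three rows,
# the second has one, so the groups are deliberately uneven.
rows = [
    ("10:00", "10:05", 1.0),
    ("10:00", "10:05", 2.0),
    ("10:00", "10:05", 3.0),
    ("10:00", "10:40", 10.0),
]

def two_level(agg, rows):
    """Nested form: aggregate per ts, then aggregate those results per hour."""
    per_ts = defaultdict(list)
    for hour, ts, val in rows:
        per_ts[(hour, ts)].append(val)
    per_hour = defaultdict(list)
    for (hour, _ts), vals in per_ts.items():
        per_hour[hour].append(agg(vals))
    return {hour: agg(vals) for hour, vals in per_hour.items()}

def flattened(agg, rows):
    """Flattened form: aggregate all rows of an hour in a single pass."""
    per_hour = defaultdict(list)
    for hour, _ts, val in rows:
        per_hour[hour].append(val)
    return {hour: agg(vals) for hour, vals in per_hour.items()}

# True where the nested and flattened forms agree on this data set.
results = {
    name: two_level(agg, rows) == flattened(agg, rows)
    for name, agg in [("SUM", sum), ("MIN", min), ("MAX", max), ("AVG", mean)]
}
print(results)
```

On this data SUM, MIN and MAX agree between the two forms, while AVG does not
(the nested form gives 6.0 for the hour, the flattened form gives 4.0). That
suggests AVG can only be flattened when the change in weighting is acceptable
or the inner groups are known to be the same size.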
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)