[
https://issues.apache.org/jira/browse/DRILL-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aman Sinha updated DRILL-350:
-----------------------------
Description:
A query may have GROUP BY keys that are not part of the SELECT list, for
example: SELECT SUM(c1) FROM t1 GROUP BY a1, b1.
The Streaming Aggregate physical operator currently projects all GROUP BY
columns on the assumption that a subsequent Project will drop the unnecessary
columns. This is sub-optimal because we incur the memory and CPU overhead of
populating the output record batch value vectors for those columns. Ideally,
the operator would keep track of the columns that are needed by the parent
(downstream) operator and project only those GROUP BY columns.
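A minimal sketch of the idea, not Drill's actual operator code: given the
GROUP BY keys and the set of columns the parent operator references, compute
which key columns need output value vectors at all. The class and method names
(GroupByProjectionSketch, keysToProject) are hypothetical, and columns are
modeled as plain strings rather than Drill expressions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Drill.
final class GroupByProjectionSketch {

    // Returns the GROUP BY keys that the parent (downstream) operator
    // actually references, preserving the original key order. Keys that
    // are not returned would still be used for grouping, but would not
    // be written to the output record batch.
    static List<String> keysToProject(List<String> groupByKeys,
                                      Set<String> neededByParent) {
        List<String> projected = new ArrayList<>();
        for (String key : groupByKeys) {
            if (neededByParent.contains(key)) {
                projected.add(key);
            }
        }
        return projected;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a1", "b1");

        // SELECT SUM(c1) FROM t1 GROUP BY a1, b1
        // The parent needs neither key, so no key column is projected.
        System.out.println(
            keysToProject(keys, Collections.<String>emptySet()));

        // SELECT a1, SUM(c1) FROM t1 GROUP BY a1, b1
        // The parent needs only a1.
        System.out.println(
            keysToProject(keys, new HashSet<>(Arrays.asList("a1"))));
    }
}
```

In the example from the description, neither a1 nor b1 is needed downstream,
so under this scheme only the SUM(c1) output would be populated.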
> Streaming Aggregate physical operator projects columns that may not be needed
> -----------------------------------------------------------------------------
>
> Key: DRILL-350
> URL: https://issues.apache.org/jira/browse/DRILL-350
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Aman Sinha