Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10394 )
Change subject: IMPALA-110 (part 2): Refactor PartitionedAggregationNode
......................................................................

IMPALA-110 (part 2): Refactor PartitionedAggregationNode

This patch refactors PartitionedAggregationNode in preparation for supporting multiple distinct operators in a query.

The primary goal of the refactor is to separate out the core aggregation functionality into a new type of object called an Aggregator. For now, each aggregation ExecNode will contain a single Aggregator. Then, future patches will extend the aggregation ExecNode to support taking a single input and processing it with multiple Aggregators, allowing us to support more exotic combinations of aggregate functions and groupings.

Specifically, this patch splits PartitionedAggregationNode into five new classes:
- Aggregator: a superclass containing the functionality that's shared between GroupingAggregator and NonGroupingAggregator.
- GroupingAggregator: this class contains the bulk of the interesting aggregation code, including everything related to creating and updating partitions and hash tables, spilling, etc.
- NonGroupingAggregator: this class handles the case of aggregations that don't have grouping exprs. Since these aggregations always result in just a single output row, the functionality here is relatively simple (e.g. no spilling or streaming).
- StreamingAggregationNode: this node performs a streaming preaggregation, where the input is retrieved from the child during GetNext() and passed to the GroupingAggregator (non-grouping aggregations do not support streaming). Eventually, we'll support a list of GroupingAggregators.
- AggregationNode: this node performs a final aggregation, where the input is retrieved from the child during Open() and passed to the Aggregator. Currently the Aggregator can be either grouping or non-grouping. Eventually we'll support a list of GroupingAggregators and/or a single NonGroupingAggregator.

Testing:
- Passed a full exhaustive run.
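The class split described above can be illustrated with a minimal, self-contained C++ sketch. It is not the code from this patch. The `Row` and `Batch` types, the `AddBatch()`/`Finish()` method names, the SUM-only aggregation and the vector standing in for the child node are all assumptions made for illustration, and partitioning, hash tables, spilling and codegen are left out entirely. The streaming node here simply emits partial sums for each input batch, which is a simplification of a real preaggregation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified row: one grouping key and one value to SUM.
struct Row {
  std::string key;
  int64_t value;
};
using Batch = std::vector<Row>;

// Stand-in for the shared Aggregator superclass (interface is assumed).
class Aggregator {
 public:
  virtual ~Aggregator() = default;
  virtual void AddBatch(const Batch& batch) = 0;
  // Returns the aggregated rows and resets internal state.
  virtual Batch Finish() = 0;
};

// Keeps one running sum per key; no partitions or spilling in this sketch.
class GroupingAggregator : public Aggregator {
 public:
  void AddBatch(const Batch& batch) override {
    for (const Row& r : batch) sums_[r.key] += r.value;
  }
  Batch Finish() override {
    Batch out;
    for (const auto& kv : sums_) out.push_back(Row{kv.first, kv.second});
    sums_.clear();
    return out;
  }

 private:
  std::map<std::string, int64_t> sums_;
};

// No grouping exprs: always produces exactly one output row.
class NonGroupingAggregator : public Aggregator {
 public:
  void AddBatch(const Batch& batch) override {
    for (const Row& r : batch) sum_ += r.value;
  }
  Batch Finish() override {
    Batch out;
    out.push_back(Row{"", sum_});
    sum_ = 0;
    return out;
  }

 private:
  int64_t sum_ = 0;
};

// Final aggregation: consumes the whole child input during Open().
class AggregationNode {
 public:
  AggregationNode(std::vector<Batch> child, std::unique_ptr<Aggregator> agg)
      : child_(std::move(child)), agg_(std::move(agg)) {}
  void Open() {
    for (const Batch& b : child_) agg_->AddBatch(b);
  }
  Batch GetNext() { return agg_->Finish(); }

 private:
  std::vector<Batch> child_;  // stand-in for the child ExecNode
  std::unique_ptr<Aggregator> agg_;
};

// Streaming preaggregation: pulls one child batch per GetNext() call and
// emits partial results for just that batch (grouping only).
class StreamingAggregationNode {
 public:
  explicit StreamingAggregationNode(std::vector<Batch> child)
      : child_(std::move(child)) {}
  // Returns false once the child input is exhausted.
  bool GetNext(Batch* out) {
    assert(out != nullptr);
    if (next_ >= child_.size()) return false;
    agg_.AddBatch(child_[next_++]);
    *out = agg_.Finish();
    return true;
  }

 private:
  std::vector<Batch> child_;  // stand-in for the child ExecNode
  std::size_t next_ = 0;
  GroupingAggregator agg_;
};
```

In this sketch, an `AggregationNode` can be constructed with either aggregator type and does all of its input processing in `Open()`, while a `StreamingAggregationNode` always owns a `GroupingAggregator` and does its work incrementally in `GetNext()`, mirroring the division of responsibilities listed in the commit message.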
Change-Id: I9e7bb583f54aa4add3738bde7f57cf3511ac567e
Reviewed-on: http://gerrit.cloudera.org:8080/10394
Reviewed-by: Thomas Marshall <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/CMakeLists.txt
A be/src/exec/aggregation-node.cc
A be/src/exec/aggregation-node.h
A be/src/exec/aggregator.cc
A be/src/exec/aggregator.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
R be/src/exec/grouping-aggregator-ir.cc
A be/src/exec/grouping-aggregator-partition.cc
A be/src/exec/grouping-aggregator.cc
R be/src/exec/grouping-aggregator.h
A be/src/exec/non-grouping-aggregator-ir.cc
A be/src/exec/non-grouping-aggregator.cc
A be/src/exec/non-grouping-aggregator.h
D be/src/exec/partitioned-aggregation-node.cc
A be/src/exec/streaming-aggregation-node.cc
A be/src/exec/streaming-aggregation-node.h
19 files changed, 3,054 insertions(+), 2,239 deletions(-)

Approvals:
  Thomas Marshall: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10394
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9e7bb583f54aa4add3738bde7f57cf3511ac567e
Gerrit-Change-Number: 10394
Gerrit-PatchSet: 13
Gerrit-Owner: Thomas Marshall <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Thomas Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
