Hello Tim Armstrong, Alex Behm, Vuk Ercegovac, Dan Hecht,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10394

to look at the new patch set (#4).

Change subject: IMPALA-110 (part 2): Refactor PartitionedAggregationNode
......................................................................

IMPALA-110 (part 2): Refactor PartitionedAggregationNode

This patch refactors PartitionedAggregationNode in preparation for
supporting multiple distinct operators in a query.

The primary goal of the refactor is to separate out the core
aggregation functionality into a new type of object called an
Aggregator. For now, each aggregation ExecNode will contain a single
Aggregator. Then, future patches will extend the aggregation ExecNode
to support taking a single input and processing it with multiple
Aggregators, allowing us to support more exotic combinations of
aggregate functions and groupings.

Specifically, this patch splits PartitionedAggregationNode into five
new classes:
- Aggregator: a superclass containing the functionality that's shared
  between GroupingAggregator and NonGroupingAggregator.
- GroupingAggregator: this class contains the bulk of the interesting
  aggregation code, including everything related to creating and
  updating partitions and hash tables, spilling, etc.
- NonGroupingAggregator: this class handles the case of aggregations
  that don't have grouping exprs. Since these aggregations always
  result in just a single output row, the functionality here is
  relatively simple (e.g. no spilling or streaming).
- StreamingAggregationNode: this node performs a streaming
  preaggregation, where the input is retrieved from the child during
  GetNext() and passed to the GroupingAggregator (non-grouping
  aggregations do not support streaming). Eventually, we'll support a
  list of GroupingAggregators.
- AggregationNode: this node performs a final aggregation, where the
  input is retrieved from the child during Open() and passed to the
  Aggregator. Currently, the Aggregator can be either grouping or
  non-grouping. Eventually, we'll support a list of GroupingAggregators
  and/or a single NonGroupingAggregator.
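
As a rough illustration of the split described above, here is a minimal,
self-contained C++ sketch. It is not the code in this patch: the Row
struct, the AddBatch()/GetResults() methods and the sum-only aggregation
are hypothetical simplifications, and partitions, hash tables, spilling
and codegen are left out entirely. It only shows the shape of the
design, an aggregation node that owns one Aggregator and does not care
whether it is the grouping or the non-grouping kind.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified row: a grouping key and a value to sum.
struct Row {
  std::string key;
  long value;
};

// Stand-in for the shared Aggregator superclass (interface is assumed).
class Aggregator {
 public:
  virtual ~Aggregator() = default;
  virtual void AddBatch(const std::vector<Row>& batch) = 0;
  virtual std::vector<Row> GetResults() const = 0;
};

// Keeps one running sum per distinct key. The real class would also
// manage partitions, hash tables and spilling.
class GroupingAggregator : public Aggregator {
 public:
  void AddBatch(const std::vector<Row>& batch) override {
    for (const Row& row : batch) sums_[row.key] += row.value;
  }
  std::vector<Row> GetResults() const override {
    std::vector<Row> out;
    for (const auto& entry : sums_) {
      out.push_back(Row{entry.first, entry.second});
    }
    return out;
  }

 private:
  std::map<std::string, long> sums_;
};

// No grouping exprs, so there is always exactly one output row.
class NonGroupingAggregator : public Aggregator {
 public:
  void AddBatch(const std::vector<Row>& batch) override {
    for (const Row& row : batch) sum_ += row.value;
  }
  std::vector<Row> GetResults() const override {
    return {Row{"", sum_}};
  }

 private:
  long sum_ = 0;
};

// Stand-in for AggregationNode: consumes all of the child's input in
// Open() and passes it to a single Aggregator of either kind.
class AggregationNode {
 public:
  explicit AggregationNode(std::unique_ptr<Aggregator> agg)
      : agg_(std::move(agg)) {}

  void Open(const std::vector<std::vector<Row>>& child_batches) {
    for (const auto& batch : child_batches) agg_->AddBatch(batch);
  }

  std::vector<Row> GetNext() const { return agg_->GetResults(); }

 private:
  std::unique_ptr<Aggregator> agg_;
};
```

In this sketch, supporting several Aggregators per node would amount to
replacing the single agg_ member with a list and feeding each batch to
every entry, which is the direction the follow-up patches describe.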

Testing:
- Ran the existing aggregation tests.

Change-Id: I9e7bb583f54aa4add3738bde7f57cf3511ac567e
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/common/global-flags.cc
M be/src/exec/CMakeLists.txt
A be/src/exec/aggregation-node.cc
A be/src/exec/aggregation-node.h
A be/src/exec/aggregator.cc
A be/src/exec/aggregator.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
A be/src/exec/grouping-aggregator-ir.cc
A be/src/exec/grouping-aggregator-partition.cc
A be/src/exec/grouping-aggregator.cc
A be/src/exec/grouping-aggregator.h
A be/src/exec/non-grouping-aggregator-ir.cc
A be/src/exec/non-grouping-aggregator.cc
A be/src/exec/non-grouping-aggregator.h
A be/src/exec/streaming-aggregation-node.cc
A be/src/exec/streaming-aggregation-node.h
19 files changed, 3,777 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/10394/4
--
To view, visit http://gerrit.cloudera.org:8080/10394
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9e7bb583f54aa4add3738bde7f57cf3511ac567e
Gerrit-Change-Number: 10394
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Marshall <thomasmarsh...@cmu.edu>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <thomasmarsh...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
