Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10394 )

Change subject: IMPALA-110 (part 2): Refactor PartitionedAggregationNode
......................................................................


Patch Set 4:

(5 comments)

I still need to look at grouping-aggregator-* one more time but here's the 
comments for the rest.

http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/exec-node.cc@310
PS4, Line 310:      if (FLAGS_use_legacy_aggregation) {
             :         *node = pool->Add(new PartitionedAggregationNode(pool, 
tnode, descs));
do we need to keep this code around?
it seems like the new stuff should be functionally, algorithmically, and 
performance equivalent, so how about we just delete the old code? it's hard to 
test both code paths and the risk here seems low enough that we can just delete 
the old one.


http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.h
File be/src/exec/non-grouping-aggregator.h:

http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.h@103
PS4, Line 103:  'process_batch_fn_' or
             :   /// 'process_batch_no_grouping_fn_' will be updated with the 
codegened function
             :   /// depending on whether this is a grouping or non-grouping 
aggregation.
is that still accurate?


http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.h@107
PS4, Line 107: CodegenProcessBatch
also the naming seems inconsistent with ProcessBatchNoGrouping(). The 
NoGrouping part seems redundant with the class name now, but I don't mind 
either way if we can make it consistent with the CodegenFn naming.


http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.cc
File be/src/exec/non-grouping-aggregator.cc:

http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.cc@123
PS4, Line 123: ProcessBatchNoGrouping
given that this is no 1:1 with AddBatch(), maybe just call it AddBatchWork() to 
make that obvious (and do the renaming for codegen fn/vars)?


http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/streaming-aggregation-node.h
File be/src/exec/streaming-aggregation-node.h:

http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/streaming-aggregation-node.h@35
PS4, Line 35: f there is not enough memory available
I think it also stops aggregating if it's not seeing a good reduction? if so, 
maybe clarify with that.



-- 
To view, visit http://gerrit.cloudera.org:8080/10394
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e7bb583f54aa4add3738bde7f57cf3511ac567e
Gerrit-Change-Number: 10394
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Marshall <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Thomas Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Comment-Date: Fri, 22 Jun 2018 21:56:41 +0000
Gerrit-HasComments: Yes

Reply via email to