Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10394 )
Change subject: IMPALA-110 (part 2): Refactor PartitionedAggregationNode ...................................................................... Patch Set 4: (5 comments) I still need to look at grouping-aggregator-* one more time but here's the comments for the rest. http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/exec-node.cc File be/src/exec/exec-node.cc: http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/exec-node.cc@310 PS4, Line 310: if (FLAGS_use_legacy_aggregation) { : *node = pool->Add(new PartitionedAggregationNode(pool, tnode, descs)); do we need to keep this code around? it seems like the new stuff should be functionally, algorithmically, and performance equivalent, so how about we just delete the old code? it's hard to test both code paths and the risk here seems low enough that we can just delete the old one. http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.h File be/src/exec/non-grouping-aggregator.h: http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.h@103 PS4, Line 103: 'process_batch_fn_' or : /// 'process_batch_no_grouping_fn_' will be updated with the codegened function : /// depending on whether this is a grouping or non-grouping aggregation. is that still accurate? http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.h@107 PS4, Line 107: CodegenProcessBatch also the naming seems inconsistent with ProcessBatchNoGrouping(). The NoGrouping part seems redundant with the class name now, but I don't mind either way if we can make it consistent with the CodegenFn naming. http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.cc File be/src/exec/non-grouping-aggregator.cc: http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/non-grouping-aggregator.cc@123 PS4, Line 123: ProcessBatchNoGrouping given that this is no 1:1 with AddBatch(), maybe just call it AddBatchWork() to make that obvious (and do the renaming for codegen fn/vars)? http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/streaming-aggregation-node.h File be/src/exec/streaming-aggregation-node.h: http://gerrit.cloudera.org:8080/#/c/10394/4/be/src/exec/streaming-aggregation-node.h@35 PS4, Line 35: f there is not enough memory available I think it also stops aggregating if it's not seeing a good reduction? if so, maybe clarify with that. -- To view, visit http://gerrit.cloudera.org:8080/10394 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e7bb583f54aa4add3738bde7f57cf3511ac567e Gerrit-Change-Number: 10394 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Marshall <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Thomas Marshall <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Comment-Date: Fri, 22 Jun 2018 21:56:41 +0000 Gerrit-HasComments: Yes
