Jim Apple has posted comments on this change. Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations ......................................................................
Patch Set 3: (15 comments) http://gerrit.cloudera.org:8080/#/c/3822/1/be/src/exec/partitioned-aggregation-node-ir.cc File be/src/exec/partitioned-aggregation-node-ir.cc: Line 130: if (group_count_ >= max_groups_) return Status::OK(); > This will also give incorrect results for spilling aggregations. If AGGREGA Done Line 184: const int cache_size = expr_vals_cache->capacity(); > I don't think there's any advantage to cross-compiling this code, so it'd b Done PS1, Line 185: r (int gro > This is overloading this variable. I'd rather keep the invariant that child Done Line 250: // Avoid repeatedly trying to add tuples when under memory pressure. > I think this should should actually occur before the remaining_capacity che Done http://gerrit.cloudera.org:8080/#/c/3822/1/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: PS1, Line 330: here is grouping, so we will do partitioned aggregation. > I think it would be good to factor this out into an inline function, e.g. I Done PS1, Line 331: (pr > I think this is overloading the meaning of this 'eos' variable. Maybe renam Done PS1, Line 506: > Probably in the header with max_groups_. I think we should also add a DCHEC Done Line 519: DCHECK(!child_eos_); > Yep exactly. Before this change it didn't need a limit check because the pl Documentation added on to new member variables. Line 573: > I think we can move the check from ProcessBatchStreaming() to either after Done http://gerrit.cloudera.org:8080/#/c/3822/1/be/src/exec/partitioned-aggregation-node.h File be/src/exec/partitioned-aggregation-node.h: Line 295: // ExecNode. max_groups_ equals limit_ if limit_ is non-negative, and is the maximum > Document that it's always a positive value, and set to the max value of int Done http://gerrit.cloudera.org:8080/#/c/3822/2/be/src/exec/partitioned-aggregation-node.h File be/src/exec/partitioned-aggregation-node.h: PS2, Line 296: // int6 > const? Done PS2, Line 339: aggregated_parti > A nice new feature of c++11, but since we initialize in the initialization Done http://gerrit.cloudera.org:8080/#/c/3822/1/fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java File fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java: Line 806: if (!node.getAggInfo().isLimitable()) { > This needs a comment to explain the intent. Or, maybe better, factor this c Done PS1, Line 807: node.unse > || normally goes on the previous line. removed PS1, Line 911: de.unsetLimit(); : } > Make this a method of AggInfo? Done -- To view, visit http://gerrit.cloudera.org:8080/3822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I59c5b7af7a73ccdbc5496b28eacb9b6859d202bc Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Jim Apple <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
