Jim Apple has posted comments on this change.

Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations
......................................................................


Patch Set 3:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/3822/1/be/src/exec/partitioned-aggregation-node-ir.cc
File be/src/exec/partitioned-aggregation-node-ir.cc:

Line 130:   if (group_count_ >= max_groups_) return Status::OK();
> This will also give incorrect results for spilling aggregations. If AGGREGA
Done


Line 184:   const int cache_size = expr_vals_cache->capacity();
> I don't think there's any advantage to cross-compiling this code, so it'd b
Done


PS1, Line 185: r (int gro
> This is overloading this variable. I'd rather keep the invariant that child
Done


Line 250:       // Avoid repeatedly trying to add tuples when under memory 
pressure.
> I think this should should actually occur before the remaining_capacity che
Done


http://gerrit.cloudera.org:8080/#/c/3822/1/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS1, Line 330: here is grouping, so we will do partitioned aggregation.
> I think it would be good to factor this out into an inline function, e.g. I
Done


PS1, Line 331: (pr
> I think this is overloading the meaning of this 'eos' variable. Maybe renam
Done


PS1, Line 506: 
> Probably in the header with max_groups_. I think we should also add a DCHEC
Done


Line 519:   DCHECK(!child_eos_);
> Yep exactly. Before this change it didn't need a limit check because the pl
Documentation added on to new member variables.


Line 573: 
> I think we can move the check from ProcessBatchStreaming() to either after 
Done


http://gerrit.cloudera.org:8080/#/c/3822/1/be/src/exec/partitioned-aggregation-node.h
File be/src/exec/partitioned-aggregation-node.h:

Line 295:   // ExecNode. max_groups_ equals limit_ if limit_ is non-negative, 
and is the maximum
> Document that it's always a positive value, and set to the max value of int
Done


http://gerrit.cloudera.org:8080/#/c/3822/2/be/src/exec/partitioned-aggregation-node.h
File be/src/exec/partitioned-aggregation-node.h:

PS2, Line 296: // int6
> const?
Done


PS2, Line 339: aggregated_parti
> A nice new feature of c++11, but since we initialize in the initialization 
Done


http://gerrit.cloudera.org:8080/#/c/3822/1/fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
File fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java:

Line 806:     if (!node.getAggInfo().isLimitable()) {
> This needs a comment to explain the intent. Or, maybe better, factor this c
Done


PS1, Line 807: node.unse
> || normally goes on the previous line.
removed


PS1, Line 911: de.unsetLimit();
             :       }
> Make this a method of AggInfo?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I59c5b7af7a73ccdbc5496b28eacb9b6859d202bc
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to