Alex Behm has posted comments on this change.

Change subject: IMPALA-5160: adjust spill buffer size based on planner estimates
......................................................................


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6963/6/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

Line 314:         bufferSize = Math.min(bufferSize, 
Math.max(minSpillableBufferBytes_,
Estimates are still conservative but more accurate than before - great! The one 
place where I think we are still too aggressive is 
PlanFragment.getNumInstances(). We "blindly" assume that we can get the maximum 
MT_DOP, but I think it will be common not to. The effective MT_DOP is limited 
by the number of scan ranges in the left-most scan, so ideally we'd take that 
into account in PlanFragment.getNumInstances(). I don't feel strongly on doing 
this now or later, but it's something to be aware of.


http://gerrit.cloudera.org:8080/#/c/6963/6/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java:

Line 186:         || numNodes_ == 0) {
What about numNodes_ == -1?


Line 190:       perInstanceDataBytes = (long) Math.ceil(getChild(1).cardinality_
For broadcast, it seems we should count the full HT size only once per node and 
not per instance.


Line 193:         perInstanceDataBytes /= 
fragment_.getNumInstances(queryOptions.getMt_dop());
getNumInstances() could return -1


http://gerrit.cloudera.org:8080/#/c/6963/6/fe/src/main/java/org/apache/impala/util/BitUtil.java
File fe/src/main/java/org/apache/impala/util/BitUtil.java:

Line 1: package org.apache.impala.util;
Apache header


http://gerrit.cloudera.org:8080/#/c/6963/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

Line 409:     // vary the spillable buffer size..
trailing "."


Line 415:     PlanNode.minSpillableBufferBytes_ = 64 * 1024;
Would it make sense to move this into RuntimeEnv. Setting a static variable in 
PlanNode seems a little weird.


-- 
To view, visit http://gerrit.cloudera.org:8080/6963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I57b5b4c528325d478c8a9b834a6bc5dedab54b5b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to