Alex Behm has posted comments on this change. Change subject: IMPALA-5160: adjust spill buffer size based on planner estimates ......................................................................
Patch Set 6: (7 comments) http://gerrit.cloudera.org:8080/#/c/6963/6/fe/src/main/java/org/apache/impala/planner/AggregationNode.java File fe/src/main/java/org/apache/impala/planner/AggregationNode.java: Line 314: bufferSize = Math.min(bufferSize, Math.max(minSpillableBufferBytes_, Estimates are still conservative but more accurate than before - great! The one place where I think we are still too aggressive is PlanFragment.getNumInstances(). We "blindly" assume that we can get the maximum MT_DOP, but I think it will be common not to. The effective MT_DOP is limited by the number of scan ranges in the left-most scan, so ideally we'd take that into account in PlanFragment.getNumInstances(). I don't feel strongly on doing this now or later, but it's something to be aware of. http://gerrit.cloudera.org:8080/#/c/6963/6/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java: Line 186: || numNodes_ == 0) { What about numNodes_ == -1? Line 190: perInstanceDataBytes = (long) Math.ceil(getChild(1).cardinality_ For broadcast, it seems we should count the full HT size only once per node and not per instance. Line 193: perInstanceDataBytes /= fragment_.getNumInstances(queryOptions.getMt_dop()); getNumInstances() could return -1 http://gerrit.cloudera.org:8080/#/c/6963/6/fe/src/main/java/org/apache/impala/util/BitUtil.java File fe/src/main/java/org/apache/impala/util/BitUtil.java: Line 1: package org.apache.impala.util; Apache header http://gerrit.cloudera.org:8080/#/c/6963/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java File fe/src/test/java/org/apache/impala/planner/PlannerTest.java: Line 409: // vary the spillable buffer size.. trailing "." Line 415: PlanNode.minSpillableBufferBytes_ = 64 * 1024; Would it make sense to move this into RuntimeEnv. Setting a static variable in PlanNode seems a little weird. -- To view, visit http://gerrit.cloudera.org:8080/6963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I57b5b4c528325d478c8a9b834a6bc5dedab54b5b Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
