Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5160: adjust spill buffer size based on planner estimates
......................................................................


Patch Set 8:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/6963/6/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

Line 314:         // default buffer size, e.g. with small dimension tables.
> That's correct. Ultimately the planner can only guess because we do the sch
Filed IMPALA-5565. I'm still a little hazy on the exact scope of what needs to 
be fixed so it might be helpful to add any ideas you had in there.


http://gerrit.cloudera.org:8080/#/c/6963/6/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java:

Line 186:     if (getChild(1).getCardinality() == -1 || 
getChild(1).getAvgRowSize() == -1
> Agree
Done. Made it check for <= 0 since there are a couple of bugs in the numNodes 
calculation that I fixed in https://gerrit.cloudera.org/#/c/7223/ where it 
could return 0.


Line 190:     } else {
> Add a Preconditions check that asserts MT_DOP=0 with a comment that the est
We still need to generate plans in planner tests, so I made the DCHECK 
conditional on whether it's a test environment.


http://gerrit.cloudera.org:8080/#/c/6963/8/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

Line 629:   protected final static long getDefaultSpillableBufferBytes() {
> Why this indirection?
To make the intent clear at the callsites. The HDFS read size and the spillable 
buffer size will be decoupled in IMPALA-3200. It will be easier to fix that up 
if we make the distinction in the planner already.


http://gerrit.cloudera.org:8080/#/c/6963/8/testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test:

Line 2: select *
> use straight_join for these tests
Done


Line 761: select string_col, count(*)
> Add a similar test with no stats for the join case (or integrate into this 
I have a test above " Join with no stats for right input - should use default 
buffers." Or did you have an additional test in mind?


-- 
To view, visit http://gerrit.cloudera.org:8080/6963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I57b5b4c528325d478c8a9b834a6bc5dedab54b5b
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to