[Impala-CR](cdh5-trunk) MT: Planner for multi-threaded execution

Marcel Kornacker (Code Review) Fri, 29 Apr 2016 13:31:10 -0700

Marcel Kornacker has posted comments on this change.

Change subject: MT: Planner for multi-threaded execution
......................................................................



Patch Set 1:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/2846/1/be/src/service/query-options.cc
File be/src/service/query-options.cc:

Line 391:       case TImpalaQueryOptions::MT_NUM_CORES: {
> How about NUM_THREADS? Reasons:
i want to prefix all multi-threading related query options with something, and 
i thought mt_ works just fine. num_threads isn't that far from 
num_scanner_threads.

also, num_threads isn't accurate: there is going to be a variable number of 
threads per fragment instance (max is num_cores), but the total number of 
threads for a particular query executing on a single node isn't really bounded.

i think v_cpu_cores will eventually go away. also, the mt_ prefix should be 
enough to disambiguate.


http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/common/TreeNode.java
File fe/src/main/java/com/cloudera/impala/common/TreeNode.java:

Line 27:   protected TreeNode() {}
> if this is to prevent public construction, add comment
Done


Line 41:   public void removeChildren() { children_.clear(); }
> clearChildren() seems clearer
Done


http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/CohortId.java
File fe/src/main/java/com/cloudera/impala/planner/CohortId.java:

Line 1: // Copyright 2012 Cloudera Inc.
> 2016 and elsewhere
Done


http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
File fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java:

Line 617:       unionNode.removeChildren();
> can be moved above this block (needs to be done in any case)
Done


Line 622:       // TODO: provide a more rigid interface for this, this feels 
error-prone.
> are we really going to address this TODO? if it's easy enough might as well
Done


http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/HashJoinNode.java
File fe/src/main/java/com/cloudera/impala/planner/HashJoinNode.java:

Line 59:   public HashTableId getHashTableId() { return hashTableId_; }
> Use JoinBuildId and move to JoinNode?
Done


http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/HashTableBuildNode.java
File fe/src/main/java/com/cloudera/impala/planner/HashTableBuildNode.java:

Line 38: public class HashTableBuildNode extends PlanNode {
> Why is this a PlanNode and not a DataSink?  Seems more logical since a join
good idea.


Line 45:    * Moves the build input of joinNode to this node (and connects this 
node
> remove parenthesis, that part seems important
Done


Line 69:     super.init(analyzer);
> set cardinality and hosts, Preconditions check that there are no conjuncts
now obsolete


http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/JoinNode.java
File fe/src/main/java/com/cloudera/impala/planner/JoinNode.java:

Line 64:   // TODO: change to a more generic abstraction, NLJs don't have hash 
tables
> This TODO sounds easy enough to address now (just rename the new stuff)
Done


http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/ParallelPlanner.java
File fe/src/main/java/com/cloudera/impala/planner/ParallelPlanner.java:

Line 34:  * The plans that produce intermediate hash tables for a particular 
plan P are grouped
> no need for the "P" (it's not used anywhere else)
Done


Line 51:    * Given a distributed plan (expressed as the root of a tree of plan 
fragments),
> suggest removing the part in parenthesis and renaming the "plan" param to "
Done


Line 73:     LOG.info("createbuildplans fragment " + 
fragment.getId().toString());
> let's not forget to eventually remove some of this logging
i was going to remove it when the first end-to-end tests run


http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/PlanFragment.java
File fe/src/main/java/com/cloudera/impala/planner/PlanFragment.java:

Line 78:   private CohortId cohortId_;
> needs comment or explanation in class comment
Done


Line 111:     LOG.info("new fragment " + id.toString());
> remove
Done


http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/Planner.java
File fe/src/main/java/com/cloudera/impala/planner/Planner.java:

Line 172:     return planner.createPlans(distrPlan.get(0));
> squeeze this in here:
Done


http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java
File fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java:

Line 402:     testPlan(testCase,  Section.DISTRIBUTEDPLAN, queryCtx, errorLog, 
actualOutput);
> extra space
Done


Line 404:     if 
(!testCase.getSectionContents(Section.PARALLELPLANS).isEmpty()) {
> The getSectionContents() check is already performed in testPlan() below.
Done


http://gerrit.cloudera.org:8080/#/c/2846/1/testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test:

Line 334: |  |  plan id: 01
> I'm not sure the plan id is really necessary. I'm ok leaving it but I'd sug
regarding plan ids: i decided not to do that, because those only change at the 
boundaries of plans/fragment trees, and i didn't want to add more repetitive 
information


Line 351: |  |--32:BUILD [01]
> How about JOIN BUILD? That seems meaningful and generic enough to also cove
Done


Line 356: |  |  |  tuple-ids=4 row-size=0B cardinality=unavailable
> even though it's not needed, it would be nice to preserve the cardinality a
now obsolete


Line 5105: |  |  |  build expressions: 
> whitespace
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/2846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3c34dd3f9190a131e6f03d901b4bfcd164a5174
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) MT: Planner for multi-threaded execution

Reply via email to