Marcel Kornacker has posted comments on this change. Change subject: MT: Planner for multi-threaded execution ......................................................................
Patch Set 1: (23 comments) http://gerrit.cloudera.org:8080/#/c/2846/1/be/src/service/query-options.cc File be/src/service/query-options.cc: Line 391: case TImpalaQueryOptions::MT_NUM_CORES: { > How about NUM_THREADS? Reasons: i want to prefix all multi-threading related query options with something, and i thought mt_ works just fine. num_threads isn't that far from num_scanner_threads. also, num_threads isn't accurate: there is going to be a variable number of threads per fragment instance (max is num_cores), but the total number of threads for a particular query executing on a single node isn't really bounded. i think v_cpu_cores will eventually go away. also, the mt_ prefix should be enough to disambiguate. http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/common/TreeNode.java File fe/src/main/java/com/cloudera/impala/common/TreeNode.java: Line 27: protected TreeNode() {} > if this is to prevent public construction, add comment Done Line 41: public void removeChildren() { children_.clear(); } > clearChildren() seems clearer Done http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/CohortId.java File fe/src/main/java/com/cloudera/impala/planner/CohortId.java: Line 1: // Copyright 2012 Cloudera Inc. > 2016 and elsewhere Done http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java File fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java: Line 617: unionNode.removeChildren(); > can be moved above this block (needs to be done in any case) Done Line 622: // TODO: provide a more rigid interface for this, this feels error-prone. > are we really going to address this TODO? if it's easy enough might as well Done http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/HashJoinNode.java File fe/src/main/java/com/cloudera/impala/planner/HashJoinNode.java: Line 59: public HashTableId getHashTableId() { return hashTableId_; } > Use JoinBuildId and move to JoinNode? Done http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/HashTableBuildNode.java File fe/src/main/java/com/cloudera/impala/planner/HashTableBuildNode.java: Line 38: public class HashTableBuildNode extends PlanNode { > Why is this a PlanNode and not a DataSink? Seems more logical since a join good idea. Line 45: * Moves the build input of joinNode to this node (and connects this node > remove parenthesis, that part seems important Done Line 69: super.init(analyzer); > set cardinality and hosts, Preconditions check that there are no conjuncts now obsolete http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/JoinNode.java File fe/src/main/java/com/cloudera/impala/planner/JoinNode.java: Line 64: // TODO: change to a more generic abstraction, NLJs don't have hash tables > This TODO sounds easy enough to address now (just rename the new stuff) Done http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/ParallelPlanner.java File fe/src/main/java/com/cloudera/impala/planner/ParallelPlanner.java: Line 34: * The plans that produce intermediate hash tables for a particular plan P are grouped > no need for the "P" (it's not used anywhere else) Done Line 51: * Given a distributed plan (expressed as the root of a tree of plan fragments), > suggest removing the part in parenthesis and renaming the "plan" param to " Done Line 73: LOG.info("createbuildplans fragment " + fragment.getId().toString()); > let's not forget to eventually remove some of this logging i was going to remove it when the first end-to-end tests run http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/PlanFragment.java File fe/src/main/java/com/cloudera/impala/planner/PlanFragment.java: Line 78: private CohortId cohortId_; > needs comment or explanation in class comment Done Line 111: LOG.info("new fragment " + id.toString()); > remove Done http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/main/java/com/cloudera/impala/planner/Planner.java File fe/src/main/java/com/cloudera/impala/planner/Planner.java: Line 172: return planner.createPlans(distrPlan.get(0)); > squeeze this in here: Done http://gerrit.cloudera.org:8080/#/c/2846/1/fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java File fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java: Line 402: testPlan(testCase, Section.DISTRIBUTEDPLAN, queryCtx, errorLog, actualOutput); > extra space Done Line 404: if (!testCase.getSectionContents(Section.PARALLELPLANS).isEmpty()) { > The getSectionContents() check is already performed in testPlan() below. Done http://gerrit.cloudera.org:8080/#/c/2846/1/testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test File testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test: Line 334: | | plan id: 01 > I'm not sure the plan id is really necessary. I'm ok leaving it but I'd sug regarding plan ids: i decided not to do that, because those only change at the boundaries of plans/fragment trees, and i didn't want to add more repetitive information Line 351: | |--32:BUILD [01] > How about JOIN BUILD? That seems meaningful and generic enough to also cove Done Line 356: | | | tuple-ids=4 row-size=0B cardinality=unavailable > even though it's not needed, it would be nice to preserve the cardinality a now obsolete Line 5105: | | | build expressions: > whitespace Done -- To view, visit http://gerrit.cloudera.org:8080/2846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic3c34dd3f9190a131e6f03d901b4bfcd164a5174 Gerrit-PatchSet: 1 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-HasComments: Yes
