Alex Behm has posted comments on this change. Change subject: MT: Planner for multi-threaded execution ......................................................................
Patch Set 3: (11 comments) http://gerrit.cloudera.org:8080/#/c/2846/3/be/src/service/query-options.cc File be/src/service/query-options.cc: Line 398: "[0, 128].", value)); should either enforce the upper bound or not mention it since it's a user-visible msg http://gerrit.cloudera.org:8080/#/c/2846/3/common/thrift/DataSinks.thrift File common/thrift/DataSinks.thrift: Line 76: // only set for hash join build sinks optional then? http://gerrit.cloudera.org:8080/#/c/2846/3/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: Line 188: 44: optional i32 mt_num_cores = 1; remove trailing ";" for consistency http://gerrit.cloudera.org:8080/#/c/2846/3/common/thrift/Planner.thrift File common/thrift/Planner.thrift: Line 80: // A plan: tree of plan fragments that materializes either a query result or a hash table or a join build http://gerrit.cloudera.org:8080/#/c/2846/3/fe/src/main/java/com/cloudera/impala/common/TreeNode.java File fe/src/main/java/com/cloudera/impala/common/TreeNode.java: Line 40: public void clearChildren(NodeType n) { Looks like there was a misunderstanding in last CR. My intention was to have something like this (which imo is clearer): removeChild(NodeType n) { children_.remove(n); } and clearChildren() { children_.clear(); } http://gerrit.cloudera.org:8080/#/c/2846/3/fe/src/main/java/com/cloudera/impala/planner/JoinBuildSink.java File fe/src/main/java/com/cloudera/impala/planner/JoinBuildSink.java: Line 51: public JoinBuildSink(JoinTableId joinTableId, PlanNode joinNode) { seems clearer to pass in a JoinNode directly http://gerrit.cloudera.org:8080/#/c/2846/3/fe/src/main/java/com/cloudera/impala/planner/ParallelPlanner.java File fe/src/main/java/com/cloudera/impala/planner/ParallelPlanner.java: Line 40: * HashTableBuildNode to build another hash table) JoinBuildSink etc. Line 181: PlanId parentPlanId = join.getFragment().getPlanId(); The id assignment might be a little easier to follow if we could make the existence of a cohort id in the parent fragment a precondition of this createBuildPlan() function. For example, we could directly assign a cohort id in createBuildPlans() if the fragment has join nodes. Then I think we can also get rid of the planCohorts_ map which is only used for id assignment purposes. http://gerrit.cloudera.org:8080/#/c/2846/3/fe/src/main/java/com/cloudera/impala/service/Frontend.java File fe/src/main/java/com/cloudera/impala/service/Frontend.java: Line 950: if (queryCtx.request.query_options.mt_num_cores != 1) { Also check RuntimeEnv.INSTANCE.isTestEnv(). Then this the MT option will just be ignored in non-test setups and we can ship this code. http://gerrit.cloudera.org:8080/#/c/2846/3/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test File testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test: Line 129: | hash table id: 00 join-table-id? Line 149: | tuple-ids=0 row-size=16B cardinality=1925 In the past, checking in tests at this level of explain detail has caused flakiness in our build and test runs, because some of these numbers (e.c., cardinality, num-hosts, and stats) tend to change "randomly" and cause failures. I think we need a solution before we can check this on or we risk destabilizing our builds, e.g., use the STANDARD explain level and include all info you care about in that. -- To view, visit http://gerrit.cloudera.org:8080/2846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic3c34dd3f9190a131e6f03d901b4bfcd164a5174 Gerrit-PatchSet: 3 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-HasComments: Yes
