Aman Sinha has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16864 )
Change subject: IMPALA-10287: Include parallelism in cost comparison of broadcast vs partition
......................................................................

IMPALA-10287: Include parallelism in cost comparison of broadcast vs partition

The current planner tends to pick broadcast distribution in some
cases even when partition distribution would be more optimal (seen
in TPC-DS performance runs).

This patch adds 2 query options:
 - use_dop_for_costing (type:boolean, default:false)
 - broadcast_to_partition_factor (type:double, default:1.0)

By default, they don't alter the current behavior of the planner.
If use_dop_for_costing is enabled, the distributed planner will
increase the cost of the broadcast join's build side by C * ln(m),
where
  m = degree of parallelism of the join node, and
  C = the broadcast_to_partition_factor
This allows the planner to more favorably consider partition
distribution where appropriate.

The choice of natural log in the calculation is not a final choice
at this point but is intended to model a non-linear relationship
between mt_dop and the query performance. After further performance
testing with tuning the above factor, we can establish a better
correlation and refine the formula.

Testing:
 - Added a new test file with TPC-DS Q78 which shows partition
   distribution for a left-outer join in the query when the query
   options are enabled (it chooses broadcast otherwise).
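
For illustration only, here is a minimal sketch of how the costing rule described above could behave. It is not the code in DistributedPlanner.java. The class, the method names and the Options holder are hypothetical. The sketch also assumes that "increase by C * ln(m)" means scaling the build-side cost by that value, and that the multiplier is clamped to at least 1 so the cost never decreases. The commit message states neither assumption.

```java
// Hypothetical sketch; none of these names come from the Impala code base.
final class BroadcastCostSketch {

  /** Hypothetical holder for the two query options named in the commit message. */
  static final class Options {
    final boolean useDopForCosting;          // use_dop_for_costing
    final double broadcastToPartitionFactor; // broadcast_to_partition_factor (C)

    Options(boolean useDopForCosting, double broadcastToPartitionFactor) {
      this.useDopForCosting = useDopForCosting;
      this.broadcastToPartitionFactor = broadcastToPartitionFactor;
    }
  }

  /**
   * Returns the broadcast build-side cost after the parallelism adjustment.
   * Assumption: the cost is multiplied by max(1, C * ln(m)), where m is the
   * degree of parallelism of the join node.
   */
  static double adjustedBroadcastCost(double buildSideCost, int dop, Options opts) {
    if (!opts.useDopForCosting || dop <= 1) {
      return buildSideCost; // default behavior: cost is unchanged
    }
    double multiplier = Math.max(1.0, opts.broadcastToPartitionFactor * Math.log(dop));
    return buildSideCost * multiplier;
  }

  /** Picks the cheaper distribution; ties go to broadcast (an assumption). */
  static String chooseDistribution(
      double broadcastCost, double partitionCost, int dop, Options opts) {
    double adjusted = adjustedBroadcastCost(broadcastCost, dop, opts);
    return adjusted <= partitionCost ? "BROADCAST" : "PARTITIONED";
  }

  public static void main(String[] args) {
    Options off = new Options(false, 1.0);
    Options on = new Options(true, 1.0);
    // Made-up costs: broadcast is cheaper before the adjustment.
    System.out.println(chooseDistribution(1000.0, 2000.0, 12, off));
    System.out.println(chooseDistribution(1000.0, 2000.0, 12, on));
  }
}
```

With these made-up costs, enabling the option at a parallelism of 12 raises the broadcast cost above the partition cost, so the choice flips from broadcast to partitioned. A larger broadcast_to_partition_factor would make that flip happen at lower parallelism.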

Change-Id: Idff569299e5c78720ca17c616a531adac78208e1
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds-dist-method.test
7 files changed, 603 insertions(+), 4 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/16864/1
--
To view, visit http://gerrit.cloudera.org:8080/16864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idff569299e5c78720ca17c616a531adac78208e1
Gerrit-Change-Number: 16864
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <[email protected]>
