Aman Sinha has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16864 )


Change subject: IMPALA-10287: Include parallelism in cost comparison of 
broadcast vs partition
......................................................................

IMPALA-10287: Include parallelism in cost comparison of broadcast vs partition

The current planner tends to pick broadcast distribution in some cases
even when partition distribution would be the better choice (seen in
TPC-DS performance runs).

This patch adds 2 query options:
 - use_dop_for_costing (type:boolean, default:false)
 - broadcast_to_partition_factor (type:double, default:1.0)
By default, they don't alter the current behavior of the planner. If
use_dop_for_costing is enabled, the distributed planner will increase
the cost of the broadcast join's build side by C * ln(m), where
  m = degree of parallelism of the join node, and
  C = the broadcast_to_partition_factor.
This makes the planner more likely to choose partition distribution
where appropriate.
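
As a rough illustration of the costing rule described above, here is a
minimal Python sketch. It is not the code in DistributedPlanner.java. The
function names, the cost units, and the decision to treat C * ln(m) as a
multiplier clamped to at least 1 are all assumptions made for the example.
Only the two option names and the C * ln(m) term come from this change
description.

```python
import math


def adjusted_broadcast_cost(build_cost, dop, factor=1.0, use_dop_for_costing=False):
    """Hypothetical helper: penalize the broadcast build-side cost by factor * ln(dop).

    Assumption: the penalty is applied as a multiplier and never drops below 1,
    so enabling the option can only raise the broadcast cost.
    """
    if not use_dop_for_costing or dop <= 1:
        # Default behavior: the cost is left untouched.
        return float(build_cost)
    multiplier = max(1.0, factor * math.log(dop))
    return build_cost * multiplier


def choose_distribution(broadcast_cost, partition_cost, dop,
                        factor=1.0, use_dop_for_costing=False):
    """Hypothetical helper: pick the cheaper distribution after the adjustment."""
    adjusted = adjusted_broadcast_cost(broadcast_cost, dop, factor, use_dop_for_costing)
    return "BROADCAST" if adjusted <= partition_cost else "PARTITION"


if __name__ == "__main__":
    # Made-up costs: broadcast looks cheaper until parallelism is factored in.
    print(choose_distribution(100, 150, dop=12))
    print(choose_distribution(100, 150, dop=12, use_dop_for_costing=True))
```

With the made-up costs above, the first call keeps broadcast and the second
switches to partition, because ln(12) is roughly 2.48 and the adjusted
broadcast cost exceeds the partition cost.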

The use of natural log in the calculation is not final at this point;
it is intended to model a non-linear relationship between mt_dop and
query performance. After further performance testing and tuning of the
above factor, we can establish a better correlation and refine the
formula.
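
To show what the non-linear relationship means in practice, the short
Python sketch below tabulates C * ln(m) for a few degrees of parallelism.
The helper name and the sample values of m are assumptions chosen for
illustration. The point is only that each doubling of m adds a constant
C * ln(2) to the term, rather than doubling it.

```python
import math


def log_penalty(dop, factor=1.0):
    """Hypothetical helper: the C * ln(m) term for a given degree of parallelism."""
    return factor * math.log(dop)


if __name__ == "__main__":
    # Sample degrees of parallelism; each step doubles m.
    for dop in (1, 2, 4, 8, 16):
        print(f"m={dop:2d}  C*ln(m)={log_penalty(dop):.3f}")
```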

Testing:
 - Added a new test file with TPC-DS Q78 which shows partition
   distribution for a left-outer join in the query when the
   query options are enabled (it chooses broadcast otherwise).

Change-Id: Idff569299e5c78720ca17c616a531adac78208e1
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds-dist-method.test
7 files changed, 603 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/16864/1
--
To view, visit http://gerrit.cloudera.org:8080/16864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idff569299e5c78720ca17c616a531adac78208e1
Gerrit-Change-Number: 16864
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <[email protected]>
