Hello Marcel Kornacker,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/4098

to look at the new patch set (#4).

Change subject: IMPALA-2932: Extend DistributedPlanner to account for hash table build cost
......................................................................

IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

When deciding between a broadcast or repartition join, Impala
calculates the cost of each join as the total amount of data that is
sent over the network. This ignores some relevant costs, and can lead
to bad plans.

One such relevant cost is the work to create the hash table used in
the join. This patch accounts for this by adding the amount of data
inserted into the hash table (the size of the right side of the join)
to the previous cost.

This generally increases the estimated cost of broadcast joins relative
to repartitioning joins, as the broadcast join must build the hash
table on each node the data was broadcast to, so its effect will be to
make repartitioning joins more likely to be chosen, especially in large
clusters.

This patch has not yet been performance tested.
Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
---
M fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
8 files changed, 148 insertions(+), 119 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/4098/4
--
To view, visit http://gerrit.cloudera.org:8080/4098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
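
For readers who want to see the cost comparison from the commit message
in concrete form, here is a rough sketch. It is not the code in
DistributedPlanner.java: the class name, method names, and parameters
are hypothetical, and the network terms rest on two assumptions that
the message does not spell out, namely that a broadcast join sends the
right side to every node and that a repartition join sends both sides
over the network once. Only the added build-cost term (the size of the
right side, counted once per node that builds a hash table) comes from
the description above.

```java
// Hypothetical illustration only; none of these names come from Impala.
final class JoinCostSketch {

    // Broadcast join: the right side is sent to every node (assumed
    // network cost), and every node then builds a hash table over the
    // full right side (the build cost this patch adds).
    static long broadcastCost(long rhsBytes, int numNodes) {
        long networkCost = rhsBytes * numNodes;
        long buildCost = rhsBytes * numNodes;
        return networkCost + buildCost;
    }

    // Repartition join: both sides are sent over the network once
    // (assumed network cost), and the right side is inserted into hash
    // tables once in total across the cluster (the added build cost).
    static long partitionCost(long lhsBytes, long rhsBytes) {
        long networkCost = lhsBytes + rhsBytes;
        long buildCost = rhsBytes;
        return networkCost + buildCost;
    }

    // Pick the broadcast join only when it is no more expensive.
    static boolean prefersBroadcast(long lhsBytes, long rhsBytes, int numNodes) {
        return broadcastCost(rhsBytes, numNodes) <= partitionCost(lhsBytes, rhsBytes);
    }
}
```

Under these assumptions, take a left side of 1500 bytes, a right side
of 100 bytes, and 10 nodes. Counting network traffic alone, broadcast
costs 1000 against 1600 for repartitioning, so broadcast wins. With the
build cost added, broadcast costs 2000 against 1700, so repartitioning
wins, which matches the shift the commit message describes for larger
clusters.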