GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/20099
[SPARK-22916][SQL] shouldn't bias towards build right if user does not
specify
## What changes were proposed in this pull request?
When there are no broadcast hints, the current Spark strategies prefer
to build right without considering the sizes of the two sides. This patch
adds logic that takes the sizes of the two tables into account when choosing
the build side. To make the logic clear, the build side is determined in the
following steps:
1. If there are broadcast hints, the build side is determined by
`broadcastSideByHints`;
2. If there are no broadcast hints, the build side is determined by
`broadcastSideByConfig`;
3. If broadcast is disabled by the config, it falls back to the next
cases.
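
The steps above can be sketched roughly as follows. This is a simplified, self-contained model and not the code in this patch: only the names `broadcastSideByHints` and `broadcastSideByConfig` come from the description, while `Side`, `pick`, `chooseBroadcastSide`, the `threshold` parameter, and the tie-breaking rule are assumptions made for illustration. The sketch also ignores the join type, which in practice restricts which side may be built.

```scala
// Simplified model of the build-side decision; NOT Spark's actual classes.
sealed trait BuildSide
case object BuildLeft extends BuildSide
case object BuildRight extends BuildSide

// Hypothetical stand-in for one join input: an estimated size in bytes
// and whether the user attached a broadcast hint to it.
final case class Side(sizeInBytes: Long, hasBroadcastHint: Boolean)

object BuildSideSketch {
  // Given which sides are eligible, pick one. When both are eligible,
  // prefer the smaller side; ties go to the right (an assumption).
  private def pick(canLeft: Boolean, canRight: Boolean,
                   left: Side, right: Side): Option[BuildSide] =
    (canLeft, canRight) match {
      case (true, true) =>
        Some(if (right.sizeInBytes <= left.sizeInBytes) BuildRight else BuildLeft)
      case (true, false)  => Some(BuildLeft)
      case (false, true)  => Some(BuildRight)
      case (false, false) => None
    }

  // Step 1: a side is eligible only if the user hinted it.
  def broadcastSideByHints(left: Side, right: Side): Option[BuildSide] =
    pick(left.hasBroadcastHint, right.hasBroadcastHint, left, right)

  // Step 2: a side is eligible only if it fits under the size threshold.
  // A negative threshold models "broadcast disabled by the config".
  def broadcastSideByConfig(left: Side, right: Side,
                            threshold: Long): Option[BuildSide] = {
    def fits(s: Side): Boolean = threshold >= 0 && s.sizeInBytes <= threshold
    pick(fits(left), fits(right), left, right)
  }

  // Hints take precedence over the config; None means "no broadcast side",
  // i.e. the caller falls back to the next cases (step 3).
  def chooseBroadcastSide(left: Side, right: Side,
                          threshold: Long): Option[BuildSide] =
    broadcastSideByHints(left, right)
      .orElse(broadcastSideByConfig(left, right, threshold))
}
```

With no hints and a threshold of 100 bytes, a 10-byte left side joined to a 1000-byte right side yields `Some(BuildLeft)` in this model, which is the case the old build-right preference did not handle.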
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/liufengdb/spark fix-spark-strategies
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20099.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20099
----
commit e4b63f5fab81b7637d107efe6524b2f41c681a10
Author: Feng Liu <fengliu@...>
Date: 2017-12-27T23:22:30Z
init
----