Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19714#discussion_r153680715
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -153,6 +151,27 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
// --- BroadcastHashJoin --------------------------------------------------------------------
+ case ExtractEquiJoinKeys(joinType, leftKeys, rightKeys, condition, left, right)
+ if canBuildRight(joinType) && canBuildLeft(joinType)
+ && left.stats.hints.broadcast && right.stats.hints.broadcast =>
--- End diff --
I think we can create new methods for it:
```
def shouldBuildLeft(joinType: JoinType, left: LogicalPlan, right: LogicalPlan): Boolean = {
  if (left.stats.hints.broadcast) {
    if (canBuildRight(joinType) && right.stats.hints.broadcast) {
      // If both sides have the broadcast hint, only broadcast the left side if its
      // estimated physical size is not larger than the right side.
      left.stats.sizeInBytes <= right.stats.sizeInBytes
    } else {
      // If only the left side has the broadcast hint, broadcast the left side.
      true
    }
  } else {
    if (canBuildRight(joinType) && right.stats.hints.broadcast) {
      // If only the right side has the broadcast hint, do not broadcast the left side.
      false
    } else {
      // If neither side has the broadcast hint, only broadcast the left side if its
      // estimated physical size is within the threshold and not larger than the right side.
      canBroadcast(left) && left.stats.sizeInBytes <= right.stats.sizeInBytes
    }
  }
}
def shouldBuildRight...
```
and use it like
```
case ExtractEquiJoinKeys(joinType, leftKeys, rightKeys, condition, left, right)
  if canBuildRight(joinType) && shouldBuildRight(joinType, left, right)
```
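To make the four cases easier to check, here is a minimal standalone sketch of the `shouldBuildLeft` decision. It does not use Spark's classes: `Hints`, `Stats`, `Plan`, the `rightBuildable` flag (standing in for `canBuildRight(joinType)`), the 10 MB `threshold`, and the size-only `canBroadcast` are all hypothetical stand-ins chosen for illustration, so treat it as a model of the logic rather than the actual implementation.

```scala
// Hypothetical stand-ins for Spark's plan statistics; NOT the real Spark classes.
final case class Hints(broadcast: Boolean)
final case class Stats(sizeInBytes: BigInt, hints: Hints)
final case class Plan(stats: Stats)

object BuildSideSketch {
  // Assumed size threshold (10 MB), used only for this illustration.
  val threshold: BigInt = BigInt(10L * 1024 * 1024)

  // Assumed size-only check: the plan's estimated size fits under the threshold.
  def canBroadcast(p: Plan): Boolean =
    p.stats.sizeInBytes >= 0 && p.stats.sizeInBytes <= threshold

  // rightBuildable stands in for canBuildRight(joinType) in the proposal above.
  def shouldBuildLeft(rightBuildable: Boolean, left: Plan, right: Plan): Boolean = {
    val rightHinted = rightBuildable && right.stats.hints.broadcast
    val leftNotLarger = left.stats.sizeInBytes <= right.stats.sizeInBytes
    (left.stats.hints.broadcast, rightHinted) match {
      case (true, true)   => leftNotLarger                        // both hinted: pick the smaller side
      case (true, false)  => true                                 // only left hinted
      case (false, true)  => false                                // only right hinted
      case (false, false) => canBroadcast(left) && leftNotLarger  // no hints: fall back to sizes
    }
  }

  // Small helper to build a plan with a given estimated size and hint flag.
  def plan(sizeInBytes: Long, hinted: Boolean): Plan =
    Plan(Stats(BigInt(sizeInBytes), Hints(hinted)))
}
```

Writing the nested `if`/`else` as a match on the pair of hint flags keeps the four branches side by side, which may make a mirrored right-side method easier to review for symmetry.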