ekoifman commented on a change in pull request #34464:
URL: https://github.com/apache/spark/pull/34464#discussion_r775074373



##########
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala
##########
@@ -69,16 +77,16 @@ object DynamicJoinSelection extends Rule[LogicalPlan] {
   }
 
   def apply(plan: LogicalPlan): LogicalPlan = plan.transformDown {
-    case j @ ExtractEquiJoinKeys(_, _, _, _, _, left, right, hint) =>
+    case j @ ExtractEquiJoinKeys(joinType, _, _, _, _, left, right, hint) =>
       var newHint = hint
       if (!hint.leftHint.exists(_.strategy.isDefined)) {
-        selectJoinStrategy(left).foreach { strategy =>
+        selectJoinStrategy(left, joinType).foreach { strategy =>

Review comment:
       For a left outer join (LOJ) with many empty partitions on the left, the 
local join can short-circuit on those partitions whether you broadcast or 
shuffle.  I'm not sure how to determine which strategy will move less data.  
Is there another heuristic that can be used?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


