sandugood commented on issue #1648:
URL: 
https://github.com/apache/datafusion-ballista/issues/1648#issuecomment-4396595831

   In Spark's AQE implementation there is a `DynamicJoinSelection`, defined in 
[DynamicJoinSelection.scala](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala)
   
   It operates on the `LogicalPlan` and attaches join strategy hints to it.
   
   The logic (forgive my Scala skills, I don't use it very often) is:
   - `apply` is called and checks whether a strategy was already defined by the user (e.g. `broadcast()` used on the rhs of a join)
   - then there are three `bool`s that decide whether to demote BroadcastHashJoin or not: `manyEmptyInPlan`, `manyEmptyInOther` and `canBroadcastPlan`
   - then it branches and creates the hint:
   ```
   if (demoteBroadcastHash && preferShuffleHash) {
     Some(SHUFFLE_HASH)
   } else if (demoteBroadcastHash) {
     Some(NO_BROADCAST_HASH)
   } else if (preferShuffleHash) {
     Some(PREFER_SHUFFLE_HASH)
   } else {
     None
   }
   ```
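
   To make the flow above easier to follow, here is a minimal, self-contained sketch of it. It is not Spark's code: `StrategyHint`, `DynamicJoinSelectionSketch`, `selectHint` and `resolveHint` are hypothetical names introduced for illustration, and the two flags are taken as plain inputs rather than derived from shuffle statistics.

   ```scala
   // Hypothetical stand-ins for Spark's join strategy hints; not the real classes.
   sealed trait StrategyHint
   case object BROADCAST extends StrategyHint
   case object SHUFFLE_HASH extends StrategyHint
   case object NO_BROADCAST_HASH extends StrategyHint
   case object PREFER_SHUFFLE_HASH extends StrategyHint

   object DynamicJoinSelectionSketch {
     // Mirrors the branching quoted above: the two flags pick at most one hint.
     def selectHint(
         demoteBroadcastHash: Boolean,
         preferShuffleHash: Boolean): Option[StrategyHint] =
       if (demoteBroadcastHash && preferShuffleHash) Some(SHUFFLE_HASH)
       else if (demoteBroadcastHash) Some(NO_BROADCAST_HASH)
       else if (preferShuffleHash) Some(PREFER_SHUFFLE_HASH)
       else None

     // Assumption: a strategy already set by the user is kept as-is, and the
     // runtime-derived hint is only used when no such strategy exists.
     def resolveHint(
         userHint: Option[StrategyHint],
         demoteBroadcastHash: Boolean,
         preferShuffleHash: Boolean): Option[StrategyHint] =
       userHint.orElse(selectHint(demoteBroadcastHash, preferShuffleHash))
   }
   ```

   For example, `resolveHint(Some(BROADCAST), true, true)` keeps the user's `BROADCAST`, while `resolveHint(None, true, true)` yields `SHUFFLE_HASH`.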
   
   After the hints are injected, it is up to `JoinSelection` to select the proper 
physical plan: 
[SparkStrategies.scala#L181](https://github.com/apache/spark/blob/c26a127ba33137f36d55bf95cac71471e2a1704f/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala#L181)



