Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19714#discussion_r153679188
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
    @@ -91,10 +91,10 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
        * predicates can be evaluated by matching join keys. If found,  Join implementations are chosen
        * with the following precedence:
        *
    -   * - Broadcast: if one side of the join has an estimated physical size that is smaller than the
    -   *     user-configurable [[SQLConf.AUTO_BROADCASTJOIN_THRESHOLD]] threshold
    -   *     or if that side has an explicit broadcast hint (e.g. the user applied the
    -   *     [[org.apache.spark.sql.functions.broadcast()]] function to a DataFrame), then that side
    +   * - Broadcast: if one side of the join has an explicit broadcast hint (e.g. the user applied the
    --- End diff --
    
    ```
    Broadcast: We prefer to broadcast the join side with an explicit broadcast
    hint (e.g. the user applied the [[org.apache.spark.sql.functions.broadcast()]]
    function to a DataFrame). If both sides have the broadcast hint, we prefer to
    broadcast the side with the smaller estimated physical size. If neither side
    has the broadcast hint, we broadcast a join side only if its estimated
    physical size is smaller than the user-configurable
    [[SQLConf.AUTO_BROADCASTJOIN_THRESHOLD]] threshold.
    ```
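
    To make the proposed precedence concrete, here is a minimal sketch of the
    selection logic in plain Scala. It is not the code in SparkStrategies.scala:
    `SideStats`, `BroadcastChoice`, and `chooseBroadcastSide` are hypothetical
    names, join-type restrictions on which side may be built are ignored, and
    picking the smaller side when both unhinted sides fit under the threshold is
    an assumption the wording above does not settle.

    ```scala
    // Hypothetical model of the precedence described above; not Spark's actual code.
    sealed trait BuildSide
    case object BuildLeft extends BuildSide
    case object BuildRight extends BuildSide

    // `hinted` stands for an explicit broadcast hint on that side;
    // `sizeInBytes` stands for its estimated physical size.
    final case class SideStats(hinted: Boolean, sizeInBytes: BigInt)

    object BroadcastChoice {
      // Returns the side to broadcast, or None if no side qualifies.
      def chooseBroadcastSide(
          left: SideStats,
          right: SideStats,
          threshold: BigInt): Option[BuildSide] = {
        // Ties go to the right side; that tie-break is an assumption.
        def smaller: BuildSide =
          if (right.sizeInBytes <= left.sizeInBytes) BuildRight else BuildLeft

        (left.hinted, right.hinted) match {
          // Both hinted: prefer the side with the smaller estimated size.
          case (true, true) => Some(smaller)
          // Exactly one hinted: the hint wins regardless of size.
          case (true, false) => Some(BuildLeft)
          case (false, true) => Some(BuildRight)
          // Neither hinted: fall back to the size threshold
          // ("smaller than", so the comparison here is strict).
          case (false, false) =>
            val leftFits = left.sizeInBytes < threshold
            val rightFits = right.sizeInBytes < threshold
            if (leftFits && rightFits) Some(smaller) // assumption, see lead-in
            else if (rightFits) Some(BuildRight)
            else if (leftFits) Some(BuildLeft)
            else None
        }
      }
    }
    ```

    With this model, a hinted side is chosen even when it is far above the
    threshold, which is the behavior change the reworded comment is meant to
    spell out.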


---
