Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19714#discussion_r153591249
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1492,6 +1492,61 @@ that these options will be deprecated in future release as more optimizations ar
       </tr>
     </table>
     
    +## Broadcast Hint for SQL Queries
    +
    +Broadcast hint is a way for users to manually annotate a query and suggest to the query optimizer the join method. 
    +It is very useful when the query optimizer cannot make optimal decision with respect to join methods 
    +due to conservativeness or the lack of proper statistics. 
    +Spark Broadcast Hint has higher priority than autoBroadcastJoin mechanism, examples:
    --- End diff --
    
    > The `BROADCAST` hint guides Spark to broadcast each specified table when 
joining them with another table or view. When Spark decides the join method, 
the broadcast hash join (i.e., BHJ) is preferred, even if the statistics are 
above the configuration `spark.sql.autoBroadcastJoinThreshold`. When both sides 
of a join are specified, Spark broadcasts the one having the lower statistics. 
Note that Spark does not guarantee BHJ is always chosen, since not all cases 
(e.g. full outer join) support BHJ. When the broadcast nested loop join is 
selected, we still respect the hint. 
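
The selection rule in the suggested wording can be sketched as a small toy model. This is only an illustration of the rule as worded above, not Spark's planner code: the function name `pick_broadcast_side`, the `BHJ_UNSUPPORTED` set, the tie-breaking, and the no-hint fallback (broadcast the smaller side if it fits under the threshold) are all assumptions made for the example. The 10 MB value is the documented default of `spark.sql.autoBroadcastJoinThreshold`.

```python
from typing import Optional

MB = 1024 * 1024
# Documented default of spark.sql.autoBroadcastJoinThreshold (10 MB).
DEFAULT_THRESHOLD = 10 * MB
# Assumption: only the example named in the comment is modelled here.
BHJ_UNSUPPORTED = {"full_outer"}


def pick_broadcast_side(
    left_size: int,
    right_size: int,
    left_hinted: bool = False,
    right_hinted: bool = False,
    join_type: str = "inner",
    threshold: int = DEFAULT_THRESHOLD,
) -> Optional[str]:
    """Return "left", "right", or None (no broadcast hash join). Hypothetical helper."""
    # Not every join type supports BHJ, so a hint is not a guarantee.
    if join_type in BHJ_UNSUPPORTED:
        return None
    # Both sides hinted: broadcast the side with the lower statistics.
    # (Tie-breaking toward the left side is an arbitrary choice in this sketch.)
    if left_hinted and right_hinted:
        return "left" if left_size <= right_size else "right"
    # A single hint wins even when that side is above the threshold.
    if left_hinted:
        return "left"
    if right_hinted:
        return "right"
    # No hint: simplified size-based fallback (an assumption of this sketch).
    under = [
        (size, side)
        for size, side in ((left_size, "left"), (right_size, "right"))
        if size <= threshold
    ]
    return min(under)[1] if under else None


# Hint overrides the threshold: a 500 MB side is still broadcast.
print(pick_broadcast_side(500 * MB, 2048 * MB, left_hinted=True))  # left
# Both sides hinted: the smaller one is chosen.
print(pick_broadcast_side(100 * MB, 50 * MB, True, True))  # right
# Full outer join: no BHJ even with a hint.
print(pick_broadcast_side(1 * MB, 1 * MB, True, True, join_type="full_outer"))  # None
```

The model deliberately leaves out the broadcast nested loop join case mentioned at the end of the comment, since the wording does not say how the side is chosen there.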


---
