GitHub user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19714#discussion_r153677668
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1492,6 +1492,61 @@ that these options will be deprecated in future release as more optimizations ar
       </tr>
     </table>
     
    +## Broadcast Hint for SQL Queries
    +
    +The broadcast hint lets users manually annotate a query to suggest a join method to the query optimizer.
    +It is useful when the query optimizer cannot make an optimal decision about the join method
    +because it is conservative or lacks proper statistics.
    +The broadcast hint takes priority over the autoBroadcastJoin mechanism. Examples:
    +
    +<div class="codetabs">
    +
    +<div data-lang="scala"  markdown="1">
    +
    +{% highlight scala %}
    +val src = sql("SELECT * FROM src")
    +broadcast(src).join(recordsDF, Seq("key")).show()
    +{% endhighlight %}
    +
    +</div>
    +
    +<div data-lang="java"  markdown="1">
    +
    +{% highlight java %}
    +Dataset<Row> src = spark.sql("SELECT * FROM src");
    +broadcast(src).join(recordsDF, "key").show();
    +{% endhighlight %}
    +
    +</div>
    +
    +<div data-lang="python"  markdown="1">
    +
    +{% highlight python %}
    +src = spark.sql("SELECT * FROM src")
    +recordsDF.join(broadcast(src), "key").show()
    +{% endhighlight %}
    +
    +</div>
    +
    +<div data-lang="r"  markdown="1">
    +
    +{% highlight r %}
    +src <- sql("SELECT * FROM src")
    +showDF(join(broadcast(src), recordsDF, src$key == recordsDF$key))
    +{% endhighlight %}
    +
    +</div>
    +
    +<div data-lang="sql"  markdown="1">
    +
    +{% highlight sql %}
    +SELECT /*+ BROADCAST(r) */ * FROM records r JOIN src s ON r.key = s.key
    +{% endhighlight %}
    +
    +</div>
    +</div>
    +(Note that we accept `BROADCAST`, `BROADCASTJOIN` and `MAPJOIN` for the broadcast hint.)
    --- End diff --
    
    Shall we turn this note into a comment in the SQL example?
    ```
    -- We accept BROADCAST, BROADCASTJOIN and MAPJOIN for the broadcast hint
    SELECT /*+ BROADCAST(r) */ * FROM records r JOIN src s ON r.key = s.key
    ```
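
To make the three accepted hint names concrete, here is a minimal Python sketch that builds the SQL example's query once per name. The helper `hinted_join_query` and the constant `BROADCAST_HINT_ALIASES` are hypothetical names introduced only for illustration and are not part of Spark. The `records` and `src` tables are assumed from the example above.

```python
# Hint names taken from the note in the diff above.
BROADCAST_HINT_ALIASES = ("BROADCAST", "BROADCASTJOIN", "MAPJOIN")


def hinted_join_query(hint_name: str, alias: str = "r") -> str:
    """Build the join query from the SQL example with the given hint name.

    Hypothetical helper for illustration only; it is not a Spark API.
    """
    name = hint_name.upper()
    if name not in BROADCAST_HINT_ALIASES:
        raise ValueError(f"not a broadcast hint name: {hint_name}")
    return (
        f"SELECT /*+ {name}({alias}) */ * "
        "FROM records r JOIN src s ON r.key = s.key"
    )


# One query per accepted hint name; only the hint keyword differs.
queries = [hinted_join_query(name) for name in BROADCAST_HINT_ALIASES]
for query in queries:
    print(query)
```

Each generated string could then be passed to `spark.sql(...)` in a session where both tables exist. That step is left out here so the sketch runs without a Spark installation.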

