[ 
https://issues.apache.org/jira/browse/SPARK-16475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865688#comment-15865688
 ] 

Apache Spark commented on SPARK-16475:
--------------------------------------

User 'rxin' has created a pull request for this issue:
https://github.com/apache/spark/pull/16925

> Broadcast Hint for SQL Queries
> ------------------------------
>
>                 Key: SPARK-16475
>                 URL: https://issues.apache.org/jira/browse/SPARK-16475
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Reynold Xin
>              Labels: releasenotes
>         Attachments: BroadcastHintinSparkSQL.pdf
>
>
> A broadcast hint lets users manually annotate a query to suggest a join 
> method to the query optimizer. It is very useful when the query optimizer 
> cannot make an optimal decision about join methods, either because it is 
> conservative or because it lacks proper statistics.
> The DataFrame API has had a broadcast hint since Spark 1.5. However, we do 
> not have equivalent functionality in SQL queries. We propose adding a 
> Hive-style broadcast hint to Spark SQL.
> For more information, please see the attached document. One note about the 
> doc: in addition to supporting "MAPJOIN", we should also support 
> "BROADCASTJOIN" and "BROADCAST" in the comment, e.g. the following should be 
> accepted:
> {code}
> SELECT /*+ MAPJOIN(b) */ ...
> SELECT /*+ BROADCASTJOIN(b) */ ...
> SELECT /*+ BROADCAST(b) */ ...
> {code}
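
As a rough illustration of the three spellings listed above, the sketch below pulls the hinted table names out of a Hive-style hint comment. It is not Spark's parser: the helper name broadcast_targets, the regular expression, the case-insensitive matching, and the handling of several comma-separated tables in one hint are all assumptions made for this example.

```python
import re

# Hint names taken from the issue description. Treating them as
# case-insensitive is an assumption of this sketch.
BROADCAST_HINTS = {"MAPJOIN", "BROADCASTJOIN", "BROADCAST"}

# Matches a Hive-style hint comment such as /*+ MAPJOIN(b) */
HINT_RE = re.compile(r"/\*\+\s*(\w+)\s*\(\s*([^)]*?)\s*\)\s*\*/")


def broadcast_targets(sql):
    """Return table names listed in broadcast hints (hypothetical helper)."""
    targets = []
    for name, args in HINT_RE.findall(sql):
        # Ignore hint comments whose name is not one of the three aliases.
        if name.upper() in BROADCAST_HINTS:
            # Accepting a comma-separated list is an assumption of this sketch.
            targets.extend(t.strip() for t in args.split(",") if t.strip())
    return targets


if __name__ == "__main__":
    for query in (
        "SELECT /*+ MAPJOIN(b) */ * FROM a JOIN b ON a.k = b.k",
        "SELECT /*+ BROADCASTJOIN(b) */ * FROM a JOIN b ON a.k = b.k",
        "SELECT /*+ BROADCAST(b) */ * FROM a JOIN b ON a.k = b.k",
    ):
        print(broadcast_targets(query))
```

An ordinary comment without the leading "+" is not treated as a hint here, so a query containing only /* MAPJOIN(b) */ yields no targets.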


