[
https://issues.apache.org/jira/browse/SPARK-16475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan resolved SPARK-16475.
---------------------------------
Resolution: Fixed
Fix Version/s: 2.2.0
Issue resolved by pull request 16925
[https://github.com/apache/spark/pull/16925]
> Broadcast Hint for SQL Queries
> ------------------------------
>
> Key: SPARK-16475
> URL: https://issues.apache.org/jira/browse/SPARK-16475
> Project: Spark
> Issue Type: Improvement
> Reporter: Reynold Xin
> Labels: releasenotes
> Fix For: 2.2.0
>
> Attachments: BroadcastHintinSparkSQL.pdf
>
>
> A broadcast hint lets users manually annotate a query to suggest a join
> method to the query optimizer. It is very useful when the optimizer cannot
> make an optimal decision about join methods because of conservativeness or
> a lack of proper statistics.
> The DataFrame API has had a broadcast hint since Spark 1.5. However, SQL
> queries have no equivalent functionality. We propose adding a Hive-style
> broadcast hint to Spark SQL.
> For more information, please see the attached document. One note about the
> doc: in addition to supporting "MAPJOIN", we should also support
> "BROADCASTJOIN" and "BROADCAST" in the hint comment. For example, the
> following should all be accepted:
> {code}
> SELECT /*+ MAPJOIN(b) */ ...
> SELECT /*+ BROADCASTJOIN(b) */ ...
> SELECT /*+ BROADCAST(b) */ ...
> {code}
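
To make the three accepted spellings concrete, here is a minimal Python sketch that treats them as aliases and extracts the hinted table names from a query string. It is only an illustration and is not how Spark's parser works. The names `BROADCAST_HINT_NAMES` and `broadcast_hint_tables` are hypothetical, and case-insensitive matching and comma-separated table lists are assumptions that the issue text does not state.

```python
import re

# Hint names the issue says should be accepted; treated here as aliases.
BROADCAST_HINT_NAMES = {"MAPJOIN", "BROADCASTJOIN", "BROADCAST"}

# Matches a hint comment directly after SELECT: /*+ NAME(arg1, arg2) */
# Case-insensitive matching is an assumption of this sketch.
_HINT_RE = re.compile(
    r"SELECT\s*/\*\+\s*(\w+)\s*\(([^)]*)\)\s*\*/",
    re.IGNORECASE,
)


def broadcast_hint_tables(sql):
    """Return the table names listed in a broadcast-style hint, or [] if none."""
    match = _HINT_RE.search(sql)
    if match is None:
        return []
    name, args = match.group(1), match.group(2)
    if name.upper() not in BROADCAST_HINT_NAMES:
        # Some other hint name: not a broadcast hint in this sketch.
        return []
    # Splitting on commas is an assumption about multi-table hints.
    return [table.strip() for table in args.split(",") if table.strip()]
```

Under these assumptions, all three example forms from the issue yield the same hinted table `b`, while a query with no hint comment yields an empty list.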