[ 
https://issues.apache.org/jira/browse/SPARK-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Brahmbhatt updated SPARK-15365:
-------------------------------------
    Description: Currently, if a table is used in a join operation, we rely 
on the size returned by the Metastore to decide whether the operation can be 
converted to a broadcast join. This optimization only kicks in for tables that 
have statistics available in the metastore. Hive generally falls back to HDFS 
if the statistics are not available directly from the metastore, and this 
seems like a reasonable choice to adopt given the optimization benefit of 
using broadcast 
joins.  (was: Currently if a table is used in join operation we rely on 
Metastore returned size to calculate if we can convert the operation to 
Broadcast join. This optimization only kicks in for table's that have the 
statics available in metastore. Hive generally rolls over to HDFS if the 
statistics are not available directly from metastore and this seems like a 
reasonable choice to adopt given the optimization benefit of using broadcast 
joins.)

> Metastore relation should fallback to HDFS size if statistics are not 
> available from table meta data.
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-15365
>                 URL: https://issues.apache.org/jira/browse/SPARK-15365
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Parth Brahmbhatt
>
> Currently, if a table is used in a join operation, we rely on the size 
> returned by the Metastore to decide whether the operation can be converted 
> to a broadcast join. This optimization only kicks in for tables that have 
> statistics available in the metastore. Hive generally falls back to HDFS if 
> the statistics are not available directly from the metastore, and this seems 
> like a reasonable choice to adopt given the optimization benefit of using 
> broadcast joins.
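
The fallback described above could look roughly like the sketch below. It is
only an illustration of the idea, not the Spark implementation: the function
names are hypothetical, a local directory walk stands in for an HDFS content
summary call, and "totalSize" is assumed to be the table parameter that holds
the metastore statistic. The 10 MB constant mirrors the default of
spark.sql.autoBroadcastJoinThreshold.

```python
import os
from typing import Callable, Mapping, Optional

# Default of spark.sql.autoBroadcastJoinThreshold: 10 MB.
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


def local_dir_size(path: str) -> int:
    """Hypothetical stand-in for an HDFS content summary: bytes under `path`."""
    if not os.path.isdir(path):
        raise FileNotFoundError(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def estimate_size_in_bytes(
    table_params: Mapping[str, str],
    location: str,
    fs_size: Callable[[str], int] = local_dir_size,
) -> Optional[int]:
    """Prefer the metastore statistic; fall back to the file system size."""
    raw = table_params.get("totalSize")  # assumed statistic key
    if raw is not None:
        try:
            size = int(raw)
        except ValueError:
            size = 0
        if size > 0:
            return size
    # Statistic missing, unparsable, or zero: ask the file system instead.
    try:
        return fs_size(location)
    except OSError:
        return None  # size unknown, so the caller should not broadcast


def can_broadcast(
    size: Optional[int], threshold: int = BROADCAST_THRESHOLD_BYTES
) -> bool:
    """Broadcast only when the size is known and within the threshold."""
    return size is not None and size <= threshold
```

Returning None when neither source yields a size keeps the planner's choice
conservative: an unknown size never qualifies for a broadcast join.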



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
