[ 
https://issues.apache.org/jira/browse/SPARK-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-17247:
------------------------------------

    Assignee: Apache Spark

> When fallback to HDFS is enabled for stats calculation, the HDFS listing and 
> size calculation should be terminated as soon as total size > broadcast 
> threshold
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17247
>                 URL: https://issues.apache.org/jira/browse/SPARK-17247
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Parth Brahmbhatt
>            Assignee: Apache Spark
>
> Currently, when a user enables spark.sql.statistics.fallBackToHdfs and no stats 
> are available from the metastore, we fall back to HDFS to compute the size. 
> This is useful for join optimization, but the HDFS listing and size 
> calculation can slow things down. To speed up the operation, we could stop 
> the size calculation as soon as the running total exceeds the broadcast 
> threshold, since the exact size does not matter beyond that point.
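
The proposal above can be illustrated with a minimal sketch. It is not
Spark's implementation: the function names are hypothetical, and a lazy walk
of a local directory stands in for the HDFS listing. It only shows the idea
of abandoning the size calculation once the running total passes the
threshold.

```python
import os
from typing import Iterable, Iterator


def iter_file_sizes(root: str) -> Iterator[int]:
    """Lazily yield the size in bytes of every file under root.

    Hypothetical stand-in for an HDFS listing; os.walk is itself lazy,
    so directories are only read as the caller asks for more sizes.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.getsize(os.path.join(dirpath, name))


def size_up_to_threshold(sizes: Iterable[int], threshold: int) -> int:
    """Sum sizes, stopping as soon as the running total exceeds threshold.

    The result is exact when it is <= threshold. Otherwise it is only a
    lower bound, which is enough to rule out a broadcast join.
    """
    total = 0
    for size in sizes:
        total += size
        if total > threshold:
            break  # no further sizes are requested from the iterator
    return total
```

A caller would compare the result with the broadcast threshold, for example
`size_up_to_threshold(iter_file_sizes(path), threshold) <= threshold`, and
treat any larger value as "too big to broadcast" rather than as an accurate
size.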



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

