[
https://issues.apache.org/jira/browse/SPARK-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen updated SPARK-17247:
------------------------------
Priority: Minor (was: Major)
Issue Type: Improvement (was: Bug)
> When fallback to HDFS is enabled for stats calculation, the HDFS listing and
> size calculation should be terminated as soon as total size > broadcast
> threshold
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-17247
> URL: https://issues.apache.org/jira/browse/SPARK-17247
> Project: Spark
> Issue Type: Improvement
> Reporter: Parth Brahmbhatt
> Priority: Minor
>
> Currently, when a user enables spark.sql.statistics.fallBackToHdfs and no
> stats are available from the metastore, we fall back to HDFS. This is useful
> for join optimization, but listing and summing file sizes can slow things
> down. To speed up the operation, we could stop the size calculation as soon
> as the running total exceeds the broadcast threshold: past that point the
> table will not be broadcast, so the exact size is not important.
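
The early-termination idea described above can be sketched as follows. This is only an illustration, not Spark's implementation: it walks a local directory tree with Python's os.scandir as a stand-in for an HDFS listing, and the function name size_exceeds_threshold and its parameters are hypothetical. In Spark the limit would presumably come from the broadcast join threshold setting.

```python
import os


def size_exceeds_threshold(root, threshold_bytes):
    """Sum file sizes under `root`, stopping once the total passes the threshold.

    Returns (exceeded, bytes_seen). When `exceeded` is True, `bytes_seen` is
    only a lower bound on the real size, because the walk stopped early.
    Hypothetical helper: a local-filesystem stand-in for an HDFS listing.
    """
    total = 0
    pending = [root]  # directories still to be listed
    while pending:
        current = pending.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    total += entry.stat(follow_symlinks=False).st_size
                    if total > threshold_bytes:
                        # The caller only needs to know the table is too
                        # large to broadcast, so skip the remaining files.
                        return True, total
    return False, total
```

The trade-off is that the returned size is no longer accurate once the threshold is crossed, which matches the observation in the description that accuracy does not matter in that case.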