[
https://issues.apache.org/jira/browse/SPARK-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon resolved SPARK-17247.
----------------------------------
Resolution: Incomplete
> when fall back to hdfs is enabled for stats calculation, the hdfs listing and
> size calculation should be terminated as soon as total size > broadcast
> threshold
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-17247
> URL: https://issues.apache.org/jira/browse/SPARK-17247
> Project: Spark
> Issue Type: Improvement
> Reporter: Parth Brahmbhatt
> Priority: Minor
> Labels: bulk-closed
>
> Currently, when a user enables spark.sql.statistics.fallBackToHdfs and no stats
> are available from the metastore, we fall back to HDFS. This is useful for join
> optimization; however, it can slow things down. To speed up the operation, we
> could stop the size calculation as soon as we hit the broadcast threshold, as
> the exact size is not important beyond that point.
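
A minimal sketch of the early-termination idea described above, using a local
directory walk as a stand-in for the HDFS listing. This is not Spark's
implementation: the function name size_up_to_threshold and its parameters are
hypothetical, and a real version would go through the Hadoop FileSystem API
and compare against the configured broadcast threshold.

```python
import os


def size_up_to_threshold(root, threshold_bytes):
    """Sum file sizes under `root`, stopping once the total exceeds the threshold.

    Returns (total_seen, exceeded). When `exceeded` is True, `total_seen` is
    only a lower bound on the real size, which is enough to rule out a
    broadcast join.
    """
    total = 0
    # os.walk lists directories lazily, so returning early also skips
    # listing the directories that have not been visited yet.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
            if total > threshold_bytes:
                return total, True
    return total, False
```

The caller only needs the boolean to decide whether the table is too large to
broadcast; the partial total is returned for logging and should not be treated
as an accurate size.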