[ 
https://issues.apache.org/jira/browse/SPARK-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-17247:
------------------------------
      Priority: Minor  (was: Major)
    Issue Type: Improvement  (was: Bug)

> When fallback to HDFS is enabled for stats calculation, the HDFS listing and 
> size calculation should be terminated as soon as total size > broadcast 
> threshold
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17247
>                 URL: https://issues.apache.org/jira/browse/SPARK-17247
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Parth Brahmbhatt
>            Priority: Minor
>
> Currently, when a user enables spark.sql.statistics.fallBackToHdfs and no stats 
> are available from the metastore, we fall back to HDFS. This is useful for join 
> optimization; however, it can slow things down. To speed up the operation, we 
> could stop the size calculation as soon as we hit the broadcast threshold, 
> since the exact size does not matter once it exceeds the threshold.
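
The early-termination idea described above can be sketched roughly as follows. This is only an illustration, not Spark's implementation: it walks a local directory tree as a stand-in for an HDFS listing, and the function name and parameters are hypothetical.

```python
import os


def size_up_to_threshold(root, threshold_bytes):
    """Sum file sizes under root, stopping once the total exceeds threshold_bytes.

    Returns (total_seen, exceeded). When exceeded is True, total_seen is only
    a lower bound on the real size, which is enough to rule out a broadcast.
    """
    total = 0
    # os.walk is lazy, so returning early skips directories not yet visited.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
            if total > threshold_bytes:
                return total, True  # early exit: stop listing and summing
    return total, False
```

A caller would pass the broadcast threshold in bytes and treat an `exceeded` result as "too large to broadcast", without needing the exact total.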



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
