[jira] [Assigned] (SPARK-17247) when fall back to hdfs is enabled for stats calculation, the hdfs listing and size calculation should be terminated as soon as total size > broadcast threshold

2016-08-25 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-17247:


Assignee: Apache Spark

> when fall back to hdfs is enabled for stats calculation, the hdfs listing and 
> size calculation should be terminated as soon as total size > broadcast 
> threshold
> --
>
> Key: SPARK-17247
> URL: https://issues.apache.org/jira/browse/SPARK-17247
> Project: Spark
> Issue Type: Bug
> Reporter: Parth Brahmbhatt
> Assignee: Apache Spark
>
> Currently, when a user enables spark.sql.statistics.fallBackToHdfs and no stats 
> are available from the metastore, we fall back to HDFS to compute the table 
> size. This is a useful join optimization; however, it can slow things down. To 
> speed up the operation, we could stop the size calculation as soon as we hit 
> the broadcast threshold, since the exact size does not matter beyond that point.
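
The early-termination idea described in the issue can be illustrated with a minimal sketch. This is not Spark's implementation: it walks a local directory tree rather than HDFS, and the names file_sizes and size_capped are hypothetical helpers introduced only for illustration. The point it shows is that once the running total exceeds the threshold, no further listing or stat calls are needed.

```python
import os
from typing import Iterable, Iterator


def file_sizes(root: str) -> Iterator[int]:
    """Lazily yield the size in bytes of each regular file under root.

    Hypothetical stand-in for an HDFS listing; directories are only
    listed when the consumer asks for the next size.
    """
    stack = [root]
    while stack:
        path = stack.pop()
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry.stat(follow_symlinks=False).st_size


def size_capped(sizes: Iterable[int], threshold: int) -> int:
    """Sum sizes, stopping as soon as the running total exceeds threshold.

    The result is exact when it is <= threshold; otherwise it only
    signals "larger than threshold", which is all a broadcast decision needs.
    """
    total = 0
    for size in sizes:
        total += size
        if total > threshold:
            break  # remaining files are never listed or stat'ed
    return total
```

A caller might write size_capped(file_sizes(table_dir), 10 * 1024 * 1024), where table_dir is a placeholder path and the 10 MB figure is only an illustrative threshold. Because file_sizes is a generator, breaking out of the loop also stops the directory traversal.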



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-17247) when fall back to hdfs is enabled for stats calculation, the hdfs listing and size calculation should be terminated as soon as total size > broadcast threshold

2016-08-25 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-17247:


Assignee: (was: Apache Spark)

> when fall back to hdfs is enabled for stats calculation, the hdfs listing and 
> size calculation should be terminated as soon as total size > broadcast 
> threshold
> --
>
> Key: SPARK-17247
> URL: https://issues.apache.org/jira/browse/SPARK-17247
> Project: Spark
> Issue Type: Bug
> Reporter: Parth Brahmbhatt
>
> Currently, when a user enables spark.sql.statistics.fallBackToHdfs and no stats 
> are available from the metastore, we fall back to HDFS to compute the table 
> size. This is a useful join optimization; however, it can slow things down. To 
> speed up the operation, we could stop the size calculation as soon as we hit 
> the broadcast threshold, since the exact size does not matter beyond that point.


