[GitHub] spark issue #14817: [SPARK-17247][SQL]: when calcualting size of a relation ...

Parth-Brahmbhatt Thu, 01 Sep 2016 12:42:37 -0700

Github user Parth-Brahmbhatt commented on the issue:

    https://github.com/apache/spark/pull/14817
  
    @hvanhovell I looked at AlterTableRecoverPartitionsCommand, and while the 
parallelism in listing could help, it will still cause a huge perf penalty. We 
have tables with millions of partitions, and we use S3 for storage, where listing 
is more expensive. I think it is much better to just stop listing once we know 
that the stat, which is used only for join optimization, won't meet the 
threshold, and I don't see a downside compared to what we currently offer.
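
To illustrate the early-stop idea, here is a minimal sketch in Python. It is not the code from the PR. The helper names (`partition_sizes`, `exceeds_threshold`, `fake_list_file_sizes`) are hypothetical, and the 10 MB threshold and 4 MB file size are made-up values for the demo. The point is only that sizes are produced lazily, so no partition is listed after the running total passes the threshold.

```python
from typing import Callable, Iterable, Iterator, List


def partition_sizes(partitions: Iterable[str],
                    list_file_sizes: Callable[[str], List[int]]) -> Iterator[int]:
    """Lazily yield the total byte size of each partition.

    Each call to list_file_sizes stands in for one (expensive) storage
    listing; it only happens when the consumer asks for the next value.
    """
    for partition in partitions:
        yield sum(list_file_sizes(partition))


def exceeds_threshold(sizes: Iterable[int], threshold: int) -> bool:
    """Return True as soon as the running total passes the threshold."""
    total = 0
    for size in sizes:
        total += size
        if total > threshold:
            return True  # stop here: remaining partitions are never listed
    return False


# Demo with a fake listing function that records which partitions it touched.
listed: List[str] = []


def fake_list_file_sizes(partition: str) -> List[int]:
    listed.append(partition)
    return [4 * 1024 * 1024]  # assumption: one 4 MB file per partition


partitions = [f"dt={i}" for i in range(1_000_000)]
threshold = 10 * 1024 * 1024  # assumption: a 10 MB join threshold

too_big = exceeds_threshold(partition_sizes(partitions, fake_list_file_sizes), threshold)
print(too_big, len(listed))  # prints "True 3"
```

In this toy run only three of the million partitions are listed before the answer is known. A table that is genuinely below the threshold still pays for a full listing.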



