[ https://issues.apache.org/jira/browse/SPARK-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-5801.
------------------------------
Resolution: Fixed
Fix Version/s: 1.4.0
Issue resolved by pull request 4747
[https://github.com/apache/spark/pull/4747]
> Shuffle creates too many nested directories
> -------------------------------------------
>
> Key: SPARK-5801
> URL: https://issues.apache.org/jira/browse/SPARK-5801
> Project: Spark
> Issue Type: Bug
> Components: Shuffle, Spark Core
> Affects Versions: 1.2.1
> Reporter: Kay Ousterhout
> Priority: Critical
> Fix For: 1.4.0
>
>
> When running Spark on EC2, shuffle files end up under four nested
> spark-<UUID> directories above the hashed directory names, for example:
> /mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5/spark-675133f0-b2c8-44a1-8775-5e394674609b/spark-69c1ea15-4e7f-454a-9f57-19763c7bdd17/spark-b036335c-60fa-48ab-a346-f1b420af2027/0c
> My understanding is that this should look like:
> /mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5/0c
> This happened when I was using the sort-based shuffle (all default
> configurations for Spark on EC2).
> This is not a correctness problem (the shuffle still works fine).
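To check whether a given node shows the same nesting, one option is to count the spark-<UUID> components in a shuffle path. The following is only a minimal sketch: the helper name count_spark_dirs and the assumption that every such directory is named spark- followed by a lowercase UUID are illustrative, taken from the two example paths quoted above, and are not part of Spark itself.

```python
import re
from pathlib import PurePosixPath

# Assumption: nested directories are named "spark-<lowercase UUID>",
# as in the example paths from the report.
SPARK_DIR = re.compile(
    r"^spark-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def count_spark_dirs(path: str) -> int:
    """Return how many components of `path` look like spark-<UUID>.

    Hypothetical helper for illustration; it only inspects the path string
    and does not touch the filesystem.
    """
    return sum(1 for part in PurePosixPath(path).parts if SPARK_DIR.match(part))


# The two paths quoted in the report.
observed = (
    "/mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5"
    "/spark-675133f0-b2c8-44a1-8775-5e394674609b"
    "/spark-69c1ea15-4e7f-454a-9f57-19763c7bdd17"
    "/spark-b036335c-60fa-48ab-a346-f1b420af2027/0c"
)
expected = "/mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5/0c"

if __name__ == "__main__":
    print(count_spark_dirs(observed), count_spark_dirs(expected))
```

A count greater than one for a path under a local directory would indicate the nesting described in the report.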