Kay Ousterhout created SPARK-5801:
-------------------------------------

             Summary: Shuffle creates too many nested directories
                 Key: SPARK-5801
                 URL: https://issues.apache.org/jira/browse/SPARK-5801
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.2.1
            Reporter: Kay Ousterhout


When running Spark on EC2, shuffle files end up under 4 nested spark-<UUID> 
directories before the hashed directory names, for example:

/mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5/spark-675133f0-b2c8-44a1-8775-5e394674609b/spark-69c1ea15-4e7f-454a-9f57-19763c7bdd17/spark-b036335c-60fa-48ab-a346-f1b420af2027/0c

My understanding is that this should look like:

/mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5/0c

This happened when I was using the sort-based shuffle (all default 
configurations for Spark on EC2).

This is not a correctness problem (the shuffle still works fine).
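
To check a shuffle path for this symptom, one option is to count its spark-<UUID> components. The following is only a minimal sketch in Python, not part of Spark: the helper name count_spark_dirs and the regular expression are assumptions for illustration, based solely on the two paths quoted above.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern: a path component of the form spark-<UUID>,
# inferred from the example paths in this report.
SPARK_DIR = re.compile(r"^spark-[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")


def count_spark_dirs(path):
    """Return how many spark-<UUID> components appear in the given path."""
    return sum(1 for part in PurePosixPath(path).parts if SPARK_DIR.match(part))


# The path observed on EC2 (copied from the report).
observed = (
    "/mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5"
    "/spark-675133f0-b2c8-44a1-8775-5e394674609b"
    "/spark-69c1ea15-4e7f-454a-9f57-19763c7bdd17"
    "/spark-b036335c-60fa-48ab-a346-f1b420af2027/0c"
)

# The layout the reporter expected.
expected = "/mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5/0c"

print(count_spark_dirs(observed))  # 4 nested spark-<UUID> directories
print(count_spark_dirs(expected))  # 1 spark-<UUID> directory
```

A count greater than 1 would indicate the extra nesting described in this report; the plain "spark" component in /mnt/spark is not counted because it has no UUID suffix.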



