Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/22112 To clarify your last few comments: I think you are saying that if all the reduce tasks were to fail, the shuffle write data from the previous attempt is still there and doesn't get removed, and since the first write wins on the rerun, the job might still use the older, already-shuffled data? So in order to fix that, we would need a way to tell the executors to remove that older committed shuffle data.
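To illustrate the concern, here is a toy sketch (not Spark's actual shuffle code; all names are hypothetical) of "first write wins" commit semantics: a rerun's output is silently dropped when the old attempt's output is still registered, so fixing the staleness requires an explicit invalidation step.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical model of a shuffle output registry with
// first-write-wins semantics.
object FirstWriteWins {
  // (shuffleId, mapId) -> committed output for that map task
  private val committed = TrieMap.empty[(Int, Int), String]

  // putIfAbsent means the FIRST committed output is kept;
  // a rerun's commit is a no-op if stale data is still present.
  def commit(shuffleId: Int, mapId: Int, data: String): Unit =
    committed.putIfAbsent((shuffleId, mapId), data)

  def read(shuffleId: Int, mapId: Int): Option[String] =
    committed.get((shuffleId, mapId))

  // The proposed fix: executors would need an explicit way to
  // drop the older committed shuffle data before the rerun.
  def invalidate(shuffleId: Int, mapId: Int): Unit =
    committed.remove((shuffleId, mapId))
}
```

With this model, a rerun after a failure still reads the old attempt's output until `invalidate` is called, which is the gap being discussed above.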