[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

cloud-fan Tue, 04 Sep 2018 18:20:01 -0700

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22112
  
    @tgravescs yes, you are right about the problem here. Instead of asking 
executors to remove old committed shuffle data, I prefer #6648, which just 
writes new shuffle data under a different file name (putting the stage attempt 
id in the shuffle file name). The reducers will ask the driver for the latest 
shuffle status (the stage attempt id) and fetch the latest shuffle data.
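
    A rough sketch of the idea, to make the flow concrete. This is not Spark 
code and not the implementation in #6648: the file name pattern, the class 
names, and the per-map-output tracking on the driver are all assumptions made 
for illustration. It only shows that a retried stage attempt writes to a new 
file instead of overwriting or deleting the old one, and that reducers resolve 
the file to read through the driver.

```python
# Illustrative model only; every name here is hypothetical, not a Spark API.

def shuffle_file_name(shuffle_id, map_id, stage_attempt_id):
    # Assumed naming scheme: the stage attempt id is part of the file name,
    # so output from different attempts never collides on disk.
    return f"shuffle_{shuffle_id}_{map_id}_{stage_attempt_id}.data"


class DriverShuffleStatus:
    """Driver-side record of the latest stage attempt per map output."""

    def __init__(self):
        self._latest = {}  # (shuffle_id, map_id) -> stage_attempt_id

    def register_map_output(self, shuffle_id, map_id, stage_attempt_id):
        key = (shuffle_id, map_id)
        # Keep the highest attempt id seen; late reports from older attempts
        # do not roll the status back.
        self._latest[key] = max(self._latest.get(key, -1), stage_attempt_id)

    def latest_attempt(self, shuffle_id, map_id):
        return self._latest[(shuffle_id, map_id)]


class ExecutorShuffleStore:
    """Executor-side storage, modeled as an in-memory dict of files."""

    def __init__(self):
        self.files = {}

    def write(self, shuffle_id, map_id, stage_attempt_id, data):
        name = shuffle_file_name(shuffle_id, map_id, stage_attempt_id)
        self.files[name] = data  # old attempts are left untouched
        return name


def reducer_fetch(driver, store, shuffle_id, map_id):
    # The reducer first asks the driver which attempt is current,
    # then fetches the file written by that attempt.
    attempt = driver.latest_attempt(shuffle_id, map_id)
    return store.files[shuffle_file_name(shuffle_id, map_id, attempt)]


driver = DriverShuffleStatus()
store = ExecutorShuffleStore()

# Stage attempt 0 commits its output, then the stage is retried as attempt 1.
store.write(shuffle_id=0, map_id=0, stage_attempt_id=0, data=b"old")
driver.register_map_output(0, 0, 0)
store.write(shuffle_id=0, map_id=0, stage_attempt_id=1, data=b"new")
driver.register_map_output(0, 0, 1)

fetched = reducer_fetch(driver, store, shuffle_id=0, map_id=0)
```

    In this model the executors never have to remove the old committed data 
for correctness; stale files are simply never read, and cleaning them up is a 
separate concern.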



