cloud-fan commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
URL: https://github.com/apache/spark/pull/24892#issuecomment-519042612

@squito the problem we need to solve is:

1. Spark may need to re-generate some shuffle blocks. If the shuffle map stage is indeterminate, we need to rerun all of its tasks.
2. While Spark is re-running an entire shuffle map stage, some tasks may fail, and we have to re-run the stage again.
3. The task scheduler won't kill tasks even if the stage attempt is aborted. This means we may have tasks from different stage attempts writing to the same shuffle file.

To solve this problem, `mapTaskAttemptId` is not enough; we need the stage attempt id. Furthermore, `mapTaskAttemptId` is not needed, because Spark always re-runs all the tasks of an indeterminate shuffle map stage. This PR uses `shuffleGeneratorId` to represent the stage attempt id, to make the concept more general on the shuffle side.
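
To make the collision in point 3 concrete, here is a toy model in Python rather than Spark code. The `MapTask` class, the `shuffle_file_key` helper and the file-name layout are all hypothetical, invented for illustration; the only idea taken from the comment is that `shuffleGeneratorId` stands for the stage attempt id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapTask:
    shuffle_id: int
    map_index: int         # partition index handled by this map task
    stage_attempt_id: int  # attempt of the shuffle map stage that launched it


def shuffle_file_key(task: MapTask, use_generator_id: bool) -> str:
    """Return the (hypothetical) name of the file this task writes."""
    base = f"shuffle_{task.shuffle_id}_{task.map_index}"
    if use_generator_id:
        # Assumption: shuffleGeneratorId is modelled as the stage attempt id.
        return f"{base}_{task.stage_attempt_id}"
    return base


# A straggler left running from aborted stage attempt 0, and the task that
# stage attempt 1 launched for the same map partition.
straggler = MapTask(shuffle_id=7, map_index=3, stage_attempt_id=0)
rerun = MapTask(shuffle_id=7, map_index=3, stage_attempt_id=1)

# Without the generator id both tasks target the same file.
collides_without_id = (
    shuffle_file_key(straggler, False) == shuffle_file_key(rerun, False)
)
# With it, each stage attempt writes to its own file.
collides_with_id = (
    shuffle_file_key(straggler, True) == shuffle_file_key(rerun, True)
)
```

In this sketch the two tasks share a key only when the stage attempt id is left out, which is the situation described above where tasks of different stage attempts write to the same shuffle file.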