cloud-fan commented on issue #24892: [SPARK-25341][Core] Support rolling back a 
shuffle map stage and re-generate the shuffle files
URL: https://github.com/apache/spark/pull/24892#issuecomment-519042612
 
 
   @squito the problem we need to solve is:
   1. Spark may need to re-generate some shuffle blocks. If the shuffle map 
stage is indeterminate, we need to rerun all of its tasks.
   2. While Spark is re-running an entire shuffle map stage, some tasks may 
fail, and we then need to re-run the stage again.
   3. The task scheduler won't kill tasks even if the stage attempt is aborted. 
This means we may have tasks from different stage attempts that write to the 
same shuffle file.
   
   To solve this problem, `mapTaskAttemptId` is not enough; we need the stage 
attempt id. Furthermore, `mapTaskAttemptId` is not needed, because Spark always 
re-runs all the tasks of an indeterminate shuffle map stage. This PR uses 
`shuffleGeneratorId` to represent the stage attempt id, which keeps the concept 
more general on the shuffle side.
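
   To illustrate point 3, here is a minimal sketch of why a stage attempt id 
in the file identity avoids the collision. It assumes a simplified file naming 
scheme. The object `ShuffleFileNaming` and both of its methods are hypothetical 
names made up for this example and are not taken from the PR's diff; only 
`shuffleGeneratorId` comes from the discussion above.

```scala
// Hypothetical sketch: the naming scheme and helper names are assumptions
// for illustration, not Spark's or this PR's actual implementation.
object ShuffleFileNaming {
  // Name built only from the shuffle id and map partition id.
  // Two stage attempts running the same map partition get the same name.
  def nameWithoutGenerator(shuffleId: Int, mapId: Int): String =
    s"shuffle_${shuffleId}_${mapId}.data"

  // Name that also carries the stage attempt id, in the role the comment
  // assigns to `shuffleGeneratorId`.
  def nameWithGenerator(shuffleId: Int, mapId: Int, shuffleGeneratorId: Int): String =
    s"shuffle_${shuffleId}_${mapId}_${shuffleGeneratorId}.data"
}

object ShuffleFileNamingDemo {
  def main(args: Array[String]): Unit = {
    val shuffleId = 3
    val mapId = 7

    // A task left over from stage attempt 0 and a task from stage attempt 1
    // both target map partition 7 of shuffle 3.
    val oldAttempt = ShuffleFileNaming.nameWithoutGenerator(shuffleId, mapId)
    val newAttempt = ShuffleFileNaming.nameWithoutGenerator(shuffleId, mapId)
    println(s"without generator id, same file: ${oldAttempt == newAttempt}")

    // With the stage attempt id included, the two tasks write to distinct files.
    val oldTagged = ShuffleFileNaming.nameWithGenerator(shuffleId, mapId, 0)
    val newTagged = ShuffleFileNaming.nameWithGenerator(shuffleId, mapId, 1)
    println(s"with generator id, same file: ${oldTagged == newTagged}")
  }
}
```

   In this sketch a per-task attempt id would also separate the two files, but 
as noted above it is unnecessary here: every task of an indeterminate stage is 
re-run together, so the stage attempt id alone identifies which generation of 
output a file belongs to.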

