[GitHub] [spark] cloud-fan commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files

GitBox Wed, 07 Aug 2019 07:33:49 -0700

URL: https://github.com/apache/spark/pull/24892#issuecomment-519124459
 
 
   BTW, another way to fix this problem is to always include the task id (not 
the task attempt id) in the shuffle block id. This works, but with a larger 
overhead:
   1. The `MapOutputTracker` needs to track a task id per shuffle block, 
instead of one shuffle generation id per shuffle.
   2. When the shuffle reader fetches the blocks of one shuffle, it needs to 
include one task id per shuffle block in the network request.
   3. The overhead is still there even if there is no indeterminate stage.
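
   To make the bookkeeping cost concrete, here is a rough toy model of the alternative described above. It is only a sketch and not Spark code: `ToyMapOutputTracker`, `block_name`, and the four-part block name are hypothetical names made up for illustration, and the real `MapOutputTracker` is far more involved.

```python
from typing import Dict, List, Tuple


def block_name(shuffle_id: int, map_index: int, task_id: int, reduce_id: int) -> str:
    # Hypothetical block name that embeds the task id of the map task
    # that wrote the block. This layout is an assumption for illustration.
    return f"shuffle_{shuffle_id}_{map_index}_{task_id}_{reduce_id}"


class ToyMapOutputTracker:
    """Toy stand-in that stores one task id per map output (point 1)."""

    def __init__(self) -> None:
        # (shuffle id, map partition index) -> task id of the latest writer
        self._task_ids: Dict[Tuple[int, int], int] = {}

    def register_map_output(self, shuffle_id: int, map_index: int, task_id: int) -> None:
        # A rerun of the same map partition replaces the earlier task id,
        # so readers stop asking for blocks written by the old run.
        self._task_ids[(shuffle_id, map_index)] = task_id

    def blocks_for_reducer(self, shuffle_id: int, reduce_id: int) -> List[str]:
        # The fetch request carries one task id per block (point 2),
        # whether or not any stage was ever rolled back (point 3).
        return [
            block_name(s, m, t, reduce_id)
            for (s, m), t in sorted(self._task_ids.items())
            if s == shuffle_id
        ]
```

   In this toy model, if map partition 0 is first written by task 10 and later rerun as task 25, a reducer is only handed the block name containing 25, so stale files from task 10 are never requested. The price is the per-block task id kept in the tracker and sent in every request.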


