[GitHub] [spark] squito commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
squito commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files URL: https://github.com/apache/spark/pull/24892#issuecomment-521687011 thanks for the looking into this @vanzin and @cloud-fan . I agree its unnecessarily confusing, would be good to make this consistent. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] squito commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
squito commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files URL: https://github.com/apache/spark/pull/24892#issuecomment-520501415 Thanks for taking another look. Another thing to keep in mind w/ speculative shuffle tasks -- the scheduler never puts speculative tasks on the same host as the original task. That ensures the speculative task isn't trying to write to the same disk with local shuffle storage. But, that doesn't help at all with distributed shuffle storage (and doesn't help deal w/ zombie tasks etc.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] squito commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
squito commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files URL: https://github.com/apache/spark/pull/24892#issuecomment-520482491 Sorry I don't understand -- why don't you want to support speculative execution? Correctness before performance, yes. On the assumptions about task independence -- yes, that *was* the assumption, before this whole thread of issues related to non-determinstic tasks and stage retry. Fetch failures are more likely on large clusters & large workloads, precisely where speculative execution is important too. If we couldn't get it to work together, then I would totally agree we should go for correctness. But I think using the global TID would give us the behavior we want. I also think the global TID is simpler. For one, debugging is simpler -- the shuffle id is actually the shuffle id; there isn't some other state that is tracked separately to know which block to get. If you're really just concerned about the overhead of an additional field in the shuffle block, I think you could even swap out the original map partition id for the TID of the map task (though that would be more complex in other ways, the `MapStatus` would need to track the TID since its position in the array would no longer be sufficient). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] squito commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
squito commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files URL: https://github.com/apache/spark/pull/24892#issuecomment-519191919 > BTW, another way to fix this problem is: always include the task id (not task attempt id) in the shuffle block id. This works, with a larger overhead: yes, actually this is exactly what I meant. I think we're actually talking about the same id -- unfortunately the naming here within spark isn't great. This is `TaskContext.taskAttemptId` : https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/TaskContext.scala#L168-L172 aka TID. This what the new shuffle api is calling (you are probably thinking of `TaskContext.attemptNumber()` : https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/TaskContext.scala#L162-L166 which I know is sometimes referred to as task attempt as well, its very confusing) As we both said, using that has larger overhead, as you've got to put it in every shuffle block. But I think it might be necessary for properly rolling back w/ speculative tasks, and I think its also necessary with a centralized shuffle store (@yifeih do you remember details here better? I will need to look through the past discussions ...), where you may have multiple tasks for the same logical task, perhaps from task retries or speculative attempts, perhaps from stage retries, and you need to clearly know which one to use downstream. in the scheme of things, that extra overhead seems pretty minimal. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] squito commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
squito commented on issue #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files URL: https://github.com/apache/spark/pull/24892#issuecomment-518740629 > The task attempt id is not enough in this situation, because for the indeterminate stage, while a single task fails, we need a whole stage rerun, only retry the failure task will cause correctness bug yes, but the scheduler is responsible for rerunning all those tasks, and then tracking shuffle files, right? (admittedly I'm still trying to understand the rest of this pr ...) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org