Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/21976#discussion_r213056190
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
@@ -2474,19 +2478,21 @@ class DAGSchedulerSuite extends SparkFunSuite with
LocalSparkContext with TimeLi
runEvent(makeCompletionEvent(
taskSets(3).tasks(0), Success, makeMapStatus("hostB", 2)))
- // There should be no new attempt of stage submitted,
- // because task(stageId=1, stageAttempt=1, partitionId=1) is still
running in
- // the current attempt (and hasn't completed successfully in any
earlier attempts).
- assert(taskSets.size === 4)
+ // At this point there should be no active task set for stageId=1 and
we need
+ // to resubmit because the output from (stageId=1, stageAttemptId=0,
partitionId=1)
+ // was ignored due to executor failure
+ assert(taskSets.size === 5)
+ assert(taskSets(4).stageId === 1 && taskSets(4).stageAttemptId === 2
+ && taskSets(4).tasks.size === 1)
- // Complete task(stageId=1, stageAttempt=1, partitionId=1)
successfully.
+ // Complete task(stageId=1, stageAttempt=2, partitionId=1)
successfully.
runEvent(makeCompletionEvent(
- taskSets(3).tasks(1), Success, makeMapStatus("hostB", 2)))
+ taskSets(4).tasks(0), Success, makeMapStatus("hostB", 2)))
--- End diff --
yes it will, marking either of these successful will work, but the
assumption on line 2469 is that it got marked completed there by the
tasksetmanager. So we don't want to send success for taskSet(3).task(1) as it
should have already been marked success
Unfortunately you can't test the interactions in this unit test, that is
why I'm working on another scheduler integration test but was going to do that
under separate jira.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]