squito commented on a change in pull request #22806: [SPARK-25250][CORE] : Late
zombie task completions handled correctly even before new taskset launched
URL: https://github.com/apache/spark/pull/22806#discussion_r250359463
##########
File path:
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
##########
@@ -2851,6 +2862,40 @@ class DAGSchedulerSuite extends SparkFunSuite with LocalSparkContext with TimeLi
}
}
+  test("SPARK-25250: Late zombie task completions handled correctly even before" +
+    " new taskset launched") {
+    val shuffleMapRdd = new MyRDD(sc, 4, Nil)
+    val shuffleDep = new ShuffleDependency(shuffleMapRdd, new HashPartitioner(4))
+    val reduceRdd = new MyRDD(sc, 4, List(shuffleDep), tracker = mapOutputTracker)
+    submit(reduceRdd, Array(0, 1, 2, 3))
+
+    completeShuffleMapStageSuccessfully(0, 0, numShufflePartitions = 4)
+
+    // Fail Stage 1 Attempt 0 with Fetch Failure
+    runEvent(makeCompletionEvent(
+      taskSets(1).tasks(0),
+      FetchFailed(makeBlockManagerId("hostA"), shuffleDep.shuffleId, 0, 0, "ignored"),
+      null))
+
+    // this will trigger a resubmission of stage 0, since we've lost some of its
+    // map output, for the next iteration through the loop
+    scheduler.resubmitFailedStages()
+    completeShuffleMapStageSuccessfully(0, 1, numShufflePartitions = 4)
+
+    runEvent(makeCompletionEvent(
+      taskSets(1).tasks(3), Success, Nil, Nil))
+    assert(completedPartitions.get(taskSets(3).stageId).get.contains(
+      taskSets(3).tasks(1).partitionId) == false, "Corresponding partition id for" +
+      " stage 1 attempt 1 is not complete yet")
Review comment:
A better check would be to make sure you have exactly the right set of
completed partitions. I think it would also be good to add some more comments
and a check on the test setup itself, something like:
```scala
// tasksets 1 & 3 should be two different attempts for our reduce stage --
// let's double-check test setup
val reduceStage = taskSets(1).stageId
assert(taskSets(3).stageId === reduceStage)
// complete one task from the original taskset, make sure we update the
// taskSchedulerImpl so it can notify all taskSetManagers. Some of that is
// mocked here, just check there is the right event.
val taskToComplete = taskSets(1).tasks(3)
runEvent(makeCompletionEvent(taskToComplete, Success, Nil, Nil))
assert(completedPartitions.getOrElse(reduceStage, Set()) ===
  Set(taskToComplete.partitionId))
```
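To illustrate why the exact-set comparison is the stronger check, here is a minimal standalone sketch. It does not use the suite's helpers: `CompletedPartitionsCheck`, `weakCheck`, `exactCheck`, and the `Completed` map type are hypothetical stand-ins for the test's `completedPartitions` bookkeeping, assumed here to map a stage id to the set of partition ids reported complete.

```scala
object CompletedPartitionsCheck {
  // Hypothetical stand-in for the suite's bookkeeping:
  // stage id -> partition ids reported as complete.
  type Completed = Map[Int, Set[Int]]

  // Weak check, in the style of the assertion in the diff: only verifies that
  // one particular partition is absent from the completed set.
  def weakCheck(completed: Completed, stageId: Int, notExpected: Int): Boolean =
    !completed.getOrElse(stageId, Set.empty[Int]).contains(notExpected)

  // Exact check, in the style of the suggestion: the completed set must equal
  // the expected set, so unexpected extra partitions are caught as well.
  def exactCheck(completed: Completed, stageId: Int, expected: Set[Int]): Boolean =
    completed.getOrElse(stageId, Set.empty[Int]) == expected

  def main(args: Array[String]): Unit = {
    val reduceStage = 1
    // Only partition 3 has legitimately completed.
    val correct: Completed = Map(reduceStage -> Set(3))
    // Simulated bug: partition 2 was also marked complete by mistake.
    val buggy: Completed = Map(reduceStage -> Set(2, 3))

    println(weakCheck(buggy, reduceStage, notExpected = 1)) // true: bug goes unnoticed
    println(exactCheck(buggy, reduceStage, Set(3)))         // false: bug is caught
    println(exactCheck(correct, reduceStage, Set(3)))       // true
  }
}
```

Using `getOrElse` with an empty default also avoids the `NoSuchElementException` that `.get(...).get` would throw if nothing had been recorded for the stage yet.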