Ngone51 commented on a change in pull request #29452:
URL: https://github.com/apache/spark/pull/29452#discussion_r476964708
##########
File path: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
##########
@@ -2077,6 +2086,98 @@ class TaskSetManagerSuite
assert(manager.resourceOffer("exec1", "host1", ANY)._1.isEmpty)
}
+ test("SPARK-21040: Tasks are not speculated on decommissioning if
speculation is disabled") {
Review comment:
Thank you @agrawaldevesh for your detailed explanation.
> In effect it is testing that TaskSetManager#checkSpeculatableTasks is returning false when speculation is disabled
IIUC, `TaskSetManager#checkSpeculatableTasks` won't be called when speculation is disabled:
https://github.com/apache/spark/blob/8b4862953a879a9b3ba6f57e669efc383df68b7c/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L212-L217
So I think the test might be testing an impossible case.
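For context, the gist of the linked `TaskSchedulerImpl` lines is that the periodic speculation check is only scheduled at all when speculation is enabled. Here is a minimal self-contained sketch of that pattern (the object and field names are illustrative, not the actual Spark code):

```scala
import java.util.concurrent.{Executors, TimeUnit}

object SpeculationGuardSketch {
  // Hypothetical stand-ins for the scheduler's config.
  val speculationEnabled: Boolean = false  // corresponds to spark.speculation
  val speculationIntervalMs: Long = 100L   // corresponds to spark.speculation.interval

  def checkSpeculatableTasks(): Unit =
    println("checking for speculatable tasks")

  def start(): Unit = {
    // The periodic check is only scheduled when speculation is enabled,
    // so checkSpeculatableTasks() is never invoked otherwise.
    if (speculationEnabled) {
      val scheduler = Executors.newSingleThreadScheduledExecutor()
      scheduler.scheduleWithFixedDelay(
        () => checkSpeculatableTasks(),
        speculationIntervalMs, speculationIntervalMs, TimeUnit.MILLISECONDS)
    }
  }
}
```

With the guard in place, a code path that reaches `checkSpeculatableTasks` while speculation is disabled never occurs, which is why the test looks like it exercises an impossible case.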
> when a decommissioned executor is finally lost, its tasks are rerun and not anytime before that.
I don't have a strong opinion about this. It's good as an integration test. But I'd point out that this is actually the original executor-loss behavior, and we should already have tests covering it. For example, when an executor has already died but the driver only learns of it once the heartbeat times out, `TaskSetManager` likewise doesn't rerun its tasks during that window. IIUC, decommissioning is built on top of this existing mechanism, so it ideally shouldn't change the original behavior.
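To make the point concrete, here is a toy model of that claim (all names are hypothetical, not Spark's real API): a task is only resubmitted once the driver actually marks its executor as lost, not while the executor is merely decommissioning or silently dead awaiting the heartbeat timeout.

```scala
object ExecutorLossSketch {
  final case class TaskAttempt(taskId: Int, executorId: String)

  var running: Set[TaskAttempt] =
    Set(TaskAttempt(0, "exec1"), TaskAttempt(1, "exec2"))
  var pending: Set[Int] = Set.empty

  // Decommissioning (or an executor that is already dead but whose
  // heartbeat hasn't timed out yet) changes nothing about the tasks
  // the driver believes are running.
  def decommission(executorId: String): Unit = ()

  // Only a confirmed loss moves that executor's tasks back to pending.
  def executorLost(executorId: String): Unit = {
    val (lost, stillRunning) = running.partition(_.executorId == executorId)
    running = stillRunning
    pending ++= lost.map(_.taskId)
  }

  def main(args: Array[String]): Unit = {
    decommission("exec1")      // running tasks untouched
    assert(pending.isEmpty)
    executorLost("exec1")      // now task 0 is rescheduled
    assert(pending == Set(0))
  }
}
```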