Ngone51 commented on a change in pull request #29452:
URL: https://github.com/apache/spark/pull/29452#discussion_r476964708
##########
File path: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
##########
@@ -2077,6 +2086,98 @@ class TaskSetManagerSuite
assert(manager.resourceOffer("exec1", "host1", ANY)._1.isEmpty)
}
+ test("SPARK-21040: Tasks are not speculated on decommissioning if
speculation is disabled") {
Review comment:
Thank you @agrawaldevesh for your detailed explanation.
> In effect it is testing that TaskSetManager#checkSpeculatableTasks is returning false when speculation is disabled
IIUC, `TaskSetManager#checkSpeculatableTasks` won't be called when speculation is disabled:
https://github.com/apache/spark/blob/8b4862953a879a9b3ba6f57e669efc383df68b7c/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L212-L217
So I think the test might be testing an impossible case.
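For context, the gist of the linked `TaskSchedulerImpl` lines is that the periodic speculation check is only scheduled at all when speculation is enabled. Here is a minimal self-contained sketch of that pattern (the object and field names are illustrative, not the actual Spark code):

```scala
import java.util.concurrent.{Executors, TimeUnit}

object SpeculationGuardSketch {
  // Hypothetical stand-ins for the scheduler's config.
  val speculationEnabled: Boolean = false  // corresponds to spark.speculation
  val speculationIntervalMs: Long = 100L   // corresponds to spark.speculation.interval

  def checkSpeculatableTasks(): Unit =
    println("checking for speculatable tasks")

  def start(): Unit = {
    // The periodic check is only scheduled when speculation is enabled,
    // so checkSpeculatableTasks() is never invoked otherwise.
    if (speculationEnabled) {
      val scheduler = Executors.newSingleThreadScheduledExecutor()
      scheduler.scheduleWithFixedDelay(
        () => checkSpeculatableTasks(),
        speculationIntervalMs, speculationIntervalMs, TimeUnit.MILLISECONDS)
    }
  }
}
```

With the guard in place, a code path that reaches `checkSpeculatableTasks` while speculation is disabled never occurs, which is why the test looks like it exercises an impossible case.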
> when a decommissioned executor is finally lost, its tasks are rerun and not anytime before that.
I don't have a strong opinion about this. It's good as an integration test. But I'd point out that this is actually the original executor-loss behavior, and we should already have tests covering it. For example, when an executor has already died but the driver only learns of it once the heartbeat times out, `TaskSetManager` likewise doesn't rerun its tasks during that window. IIUC, decommissioning is built on top of this existing mechanism, so it ideally shouldn't change the original behavior.
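To make the point concrete, here is a toy model of that claim (all names are hypothetical, not Spark's real API): a task is only resubmitted once the driver actually marks its executor as lost, not while the executor is merely decommissioning or silently dead awaiting the heartbeat timeout.

```scala
object ExecutorLossSketch {
  final case class TaskAttempt(taskId: Int, executorId: String)

  var running: Set[TaskAttempt] =
    Set(TaskAttempt(0, "exec1"), TaskAttempt(1, "exec2"))
  var pending: Set[Int] = Set.empty

  // Decommissioning (or an executor that is already dead but whose
  // heartbeat hasn't timed out yet) changes nothing about the tasks
  // the driver believes are running.
  def decommission(executorId: String): Unit = ()

  // Only a confirmed loss moves that executor's tasks back to pending.
  def executorLost(executorId: String): Unit = {
    val (lost, stillRunning) = running.partition(_.executorId == executorId)
    running = stillRunning
    pending ++= lost.map(_.taskId)
  }

  def main(args: Array[String]): Unit = {
    decommission("exec1")      // running tasks untouched
    assert(pending.isEmpty)
    executorLost("exec1")      // now task 0 is rescheduled
    assert(pending == Set(0))
  }
}
```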