Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/20640
  
    @squito @IgorBerman a few points:
    a) This should be configurable. SPARK-16630 is not closed yet; it seems Kubernetes also needs to support checking for blacklisted nodes when launching tasks. From what I see [here](https://github.com/apache/spark/blob/f41c0a93fd3913ad93e55ddbfd875229872ecc97/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L172) though, CoarseGrainedSchedulerBackend checks whether an executor is acceptable when it registers with the backend, so if it was launched on a blacklisted node it is removed immediately. Of course we want to fail fast, so in the Kubernetes case we should also have code that excludes blacklisted nodes when launching pods; see [here](https://github.com/kubernetes/kubernetes/issues/14573) and [here](https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/). This is essentially the same concept as in [yarn](https://github.com/apache/spark/blob/f41c0a93fd3913ad93e55ddbfd875229872ecc97/resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala#L127-L133); a rough sketch follows these two points.
     
    b) I see TaskSchedulerImpl as a higher-level abstraction compared to CoarseGrainedSchedulerBackend (the latter is subclassed by all major backends). The thing is that the blacklist info is kept in TaskSchedulerImpl and all backends update it implicitly through different paths. CoarseGrainedSchedulerBackend updates it when there is a new status update, for [example](https://github.com/apache/spark/blob/f41c0a93fd3913ad93e55ddbfd875229872ecc97/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L122); at the end of this path [handleFailedTask](https://github.com/apache/spark/blob/f41c0a93fd3913ad93e55ddbfd875229872ecc97/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L772) is called, which updates the related info.
    A more interesting path for this discussion is when an executor disconnects. This is where [Yarn](https://github.com/apache/spark/blob/f41c0a93fd3913ad93e55ddbfd875229872ecc97/resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala#L213) and [Kubernetes](https://github.com/apache/spark/blob/f41c0a93fd3913ad93e55ddbfd875229872ecc97/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala#L428) override the driver's endpoint to set the [criteria](https://github.com/apache/spark/blob/f41c0a93fd3913ad93e55ddbfd875229872ecc97/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L356-L366) for when an executor is considered lost and its task failures should be taken into account for blacklisting. The reason you want this is preemption; for Mesos, besides that, there is the scenario where an agent restarts ([the agent process is restarted](http://mesos.apache.org/documentation/latest/agent-recovery/)) and its executors are killed (framework checkpointing not enabled, e.g. when upgrading). In that case, if I restart my agent process I don't want to blacklist it: the agent is ready to go as soon as it reconnects to the master, and I don't want to wait for the blacklisting to expire.
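
    As a rough sketch of what point (a) could look like on the Mesos side, the backend could consult the scheduler's node blacklist before launching executors and decline offers from blacklisted hosts, in the same spirit as the Yarn code linked above. `nodeBlacklist()` exists on `TaskSchedulerImpl` at the linked commit, but the `OfferFilter` class and the `(offerId, hostname)` representation below are made up purely for illustration:

```scala
// Hypothetical sketch only: filter out blacklisted hosts before launching
// executors, similar in spirit to how YarnSchedulerBackend forwards the
// scheduler's node blacklist to the YarnAllocator.
package org.apache.spark.scheduler.cluster.mesos

import org.apache.spark.scheduler.TaskSchedulerImpl

private[spark] class OfferFilter(scheduler: TaskSchedulerImpl) {

  /** Hosts currently blacklisted at the application level by the BlacklistTracker. */
  private def blacklistedHosts: Set[String] = scheduler.nodeBlacklist()

  /** Split (offerId, hostname) offers into (usable, toDecline). */
  def partitionOffers(
      offers: Seq[(String, String)]): (Seq[(String, String)], Seq[(String, String)]) = {
    val blacklist = blacklistedHosts
    offers.partition { case (_, hostname) => !blacklist.contains(hostname) }
  }
}
```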
    
    To move forward, I suggest:
    a) remove the blacklist handling logic in Mesos; logging is already done when a node is blacklisted [here](https://github.com/apache/spark/blob/f41c0a93fd3913ad93e55ddbfd875229872ecc97/core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala#L194).
    b) override the logic at the driver's endpoint when there is a [need](http://mesos.apache.org/documentation/latest/oversubscription/) for doing so; a sketch of what that could look like follows.
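
    Sketching suggestion (b) under the same assumptions: a Mesos-style backend could override `createDriverEndpoint` the way Yarn and Kubernetes do, and pick the `ExecutorLossReason` so that an executor lost because of a restarting agent is not counted against the node. The class name and the `agentIsRestarting` check below are hypothetical; the members being overridden or called (`createDriverEndpoint`, `onDisconnected`, `disableExecutor`, `removeExecutor`) come from CoarseGrainedSchedulerBackend at the linked commit, so treat this as a rough outline rather than a working implementation:

```scala
// Hypothetical sketch only: decide whether a disconnected executor should count
// towards blacklisting, modeled on how YarnDriverEndpoint overrides onDisconnected.
package org.apache.spark.scheduler.cluster.mesos

import org.apache.spark.rpc.{RpcAddress, RpcEnv}
import org.apache.spark.scheduler.{ExecutorExited, ExecutorLossReason, SlaveLost, TaskSchedulerImpl}
import org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend

private[spark] class BlacklistAwareMesosBackend(
    scheduler: TaskSchedulerImpl,
    env: RpcEnv)
  extends CoarseGrainedSchedulerBackend(scheduler, env) {

  // Placeholder: would ask the Mesos master whether the agent is merely restarting.
  private def agentIsRestarting(host: String): Boolean = false

  // Forward the chosen loss reason to the driver endpoint.
  private[mesos] def reportExecutorLoss(executorId: String, reason: ExecutorLossReason): Unit =
    removeExecutor(executorId, reason)

  override protected def createDriverEndpoint(
      properties: Seq[(String, String)]): DriverEndpoint = {
    new DriverEndpoint(rpcEnv, properties) {
      override def onDisconnected(remoteAddress: RpcAddress): Unit = {
        addressToExecutorId.get(remoteAddress).foreach { executorId =>
          if (disableExecutor(executorId)) {
            // An agent restart is not the app's fault: report the loss with
            // exitCausedByApp = false so the BlacklistTracker does not blame the
            // node; otherwise fall back to the usual SlaveLost reason.
            val reason =
              if (agentIsRestarting(remoteAddress.host)) {
                ExecutorExited(0, false, s"Agent ${remoteAddress.host} is restarting")
              } else {
                SlaveLost(s"Executor $executorId disconnected")
              }
            reportExecutorLoss(executorId, reason)
          }
        }
      }
    }
  }
}
```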
