akirillov commented on issue #20640: [SPARK-19755][Mesos] Blacklist is always 
active for MesosCoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/20640#issuecomment-526274978
 
 
   Thanks, @IgorBerman. Based on the conversation, it looks like the more 
general idea is to apply logic similar to `BlacklistTracker`, but to Mesos 
task failures. A Mesos task launch can fail for several reasons, including 
`TASK_ERROR` due to lack of permissions (not node-specific) and 
`TASK_KILLED` due to over-commitment or the upcoming Mesos node draining 
feature. So it doesn't seem that `BlacklistTracker` can be used for this 
purpose, and another implementation is needed. 
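
   To make the distinction concrete, here is a minimal sketch of what such a separate tracker could look like. It is illustrative only and not code from this PR or from Spark: `AgentFailureTracker`, its parameters, and the choice of which task states count as node-specific are all assumptions.

```python
import time
from collections import defaultdict, deque

# Assumption: which Mesos task states point at a bad node. Illustrative only.
NODE_SPECIFIC_STATES = {"TASK_FAILED"}


class AgentFailureTracker:
    """Hypothetical tracker: counts recent node-specific failures per agent."""

    def __init__(self, max_failures=2, window_seconds=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.clock = clock
        self._failures = defaultdict(deque)  # agent_id -> failure timestamps

    def record(self, agent_id, task_state):
        # Ignore states that say nothing about the health of the node,
        # e.g. TASK_ERROR (permissions) or TASK_KILLED (over-commitment, draining).
        if task_state in NODE_SPECIFIC_STATES:
            self._failures[agent_id].append(self.clock())

    def is_excluded(self, agent_id):
        # Drop failures that fell out of the window, then compare to the threshold.
        cutoff = self.clock() - self.window_seconds
        timestamps = self._failures[agent_id]
        while timestamps and timestamps[0] <= cutoff:
            timestamps.popleft()
        return len(timestamps) >= self.max_failures
```

   With this shape, a `TASK_ERROR` or `TASK_KILLED` status update never counts against an agent, and an agent stops being excluded once its failures age out of the window.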
   
   Speaking more generally, if a node has failed or there is a network 
failure, the scheduler will not receive offers from that node and won't 
attempt to launch a task (executor) on it. Also, given that the 
coarse-grained scheduler is the default and the fine-grained scheduler is 
deprecated, scheduling happens only on application start (except in the 
dynamic allocation use case). So given the nature and duration of the 
scheduling step, it's not clear that blacklisting makes sense for the 
scheduling of executors themselves.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 


With regards,
Apache Git Services
