Github user squito commented on the issue:

    https://github.com/apache/spark/pull/20640
  
    thanks @IgorBerman, the description looks fine to me now; maybe I misread it before.
    
    your test sounds pretty good to me ... you could turn on debug logging for MesosCoarseGrainedSchedulerBackend and look for these log messages:
https://github.com/apache/spark/blob/master/resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala#L603
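
    In case it helps, here's one way to flip that logger to DEBUG from the driver. This is just a sketch using the log4j 1.x API that Spark currently ships with; the equivalent conf/log4j.properties line is in the comment:

        import org.apache.log4j.{Level, Logger}

        // same effect as adding this line to conf/log4j.properties:
        //   log4j.logger.org.apache.spark.scheduler.cluster.mesos=DEBUG
        Logger
          .getLogger("org.apache.spark.scheduler.cluster.mesos")
          .setLevel(Level.DEBUG)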
    
    What do you mean by "it didn't have much effect"? It sounds like it did exactly the right thing.
    
    Sorry, I don't really understand the description of the other bug you mentioned. Why shouldn't it start a 2nd executor on the same slave for the same application? That seems fine until you have enough failures for node blacklisting to take effect. There is also a small (and relatively benign) race that would allow you to get an executor on a node which you are in the middle of blacklisting.
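
    Purely to illustrate what I mean by that race, here's a minimal, hypothetical check-then-act sketch (none of these names come from the actual scheduler code):

        import scala.collection.mutable

        object BlacklistRaceSketch {
          // hypothetical stand-in for the node blacklist
          val blacklisted = mutable.Set.empty[String]

          def launchExecutorOn(node: String): Unit =
            println(s"launching executor on $node")

          // The check and the launch are not atomic: if another thread adds
          // `node` to `blacklisted` between the two, we still launch there.
          def maybeLaunch(node: String): Unit = {
            if (!blacklisted.contains(node)) { // check
              // <-- a concurrent blacklist update can land here
              launchExecutorOn(node)           // act
            }
          }
        }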

