Github user squito commented on the issue:
https://github.com/apache/spark/pull/17619
ok I think I understand. This sounds like the equivalent of some of the
existing blacklisting behavior, which currently only exists on yarn -- when a
request is made to yarn, the spark context tells yarn which nodes it has
blacklisted:
https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala#L128
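to make that concrete, here's a simplified, self-contained sketch of the shape of that hand-off (the class and method names below are illustrative, loosely modeled on the linked code rather than copied from it): when the driver asks the cluster manager for a new executor total, it piggybacks the set of currently-blacklisted nodes on that request, so yarn can avoid allocating containers on those hosts.

```scala
// Simplified sketch of the yarn-side blacklist hand-off; these are not the
// actual Spark classes, just the shape of the interaction.
case class RequestExecutors(
    requestedTotal: Int,
    nodeBlacklist: Set[String]) // hosts the scheduler currently considers bad

trait TaskSchedulerLike {
  // in Spark this information comes from the task scheduler's blacklist tracking
  def nodeBlacklist(): Set[String]
}

class SchedulerBackendSketch(scheduler: TaskSchedulerLike) {
  // Build the message sent to the cluster manager: the requested executor
  // count plus the current node blacklist, so the resource manager can skip
  // those hosts when allocating new containers.
  def prepareRequestExecutors(requestedTotal: Int): RequestExecutors =
    RequestExecutors(requestedTotal, scheduler.nodeBlacklist())
}
```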
however, it still seems like there is a missing piece -- you have to tell
mesos which nodes you don't want executors on, right?
I also don't understand why you'd get *starvation* in your app with this --
shouldn't mesos be requesting executors on other nodes?
anyway, I agree that something seems wrong with the mesos scheduling
when there is a bad node, but I'm not certain this is the right fix, and I just
don't know enough about the communication between mesos and spark to say
exactly what should be done instead, sorry.
@mgummelt can you comment?
might actually be better to have this discussion on jira, since we're
talking about general design, not specifics of this change