Imran Rashid commented on SPARK-23485:

Yeah, I don't think it's safe to assume that it's Kubernetes's responsibility to 
entirely figure out the equivalent of a Spark application's internal blacklist. 
 You can't guarantee that it will detect hardware issues, and the problem might 
also be specific to the Spark application (e.g., a missing jar). 
 Yarn has some basic detection of bad nodes as well, but we observed cases in 
production where, without Spark's blacklisting, one bad disk would effectively 
take out an entire application on a large cluster, because many task failures 
could pile up very quickly.
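To make the failure-pile-up concrete, here is a rough sketch in plain Scala of the heuristic involved (illustrative only — Spark's real logic lives in BlacklistTracker and is driven by the spark.blacklist.* configs; the threshold name and value below are just for the example):

```scala
// Assumed threshold, for illustration only.
val maxFailedTasksPerNode = 2

// Once enough task failures accumulate on one node, treat the node as bad.
def badNodes(failuresByNode: Seq[String]): Set[String] =
  failuresByNode
    .groupBy(identity)
    .collect { case (node, fails) if fails.size >= maxFailedTasksPerNode => node }
    .toSet

// One bad disk on "node-3" produces a burst of failures, so it gets flagged,
// while a single transient failure on "node-1" does not.
println(badNodes(Seq("node-3", "node-3", "node-3", "node-1"))) // prints Set(node-3)
```

Without this kind of per-node accounting at the application level, each of those failures just looks like an ordinary task failure to the cluster manager.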

That said, the existing blacklist implementation in spark already handles that 
case, even without the extra handling I'm proposing here.  The spark app would 
still have its own node blacklist, and would avoid scheduling tasks on that node.

However, this is suboptimal because spark isn't really getting as many 
resources as it should.  E.g., it would request 10 executors, kubernetes hands 
it 10, but spark can really only use 8 of them, because 2 live on a node that is 
blacklisted.

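A minimal sketch of that accounting problem, in plain Scala with no Spark APIs (the types and numbers are hypothetical): Kubernetes grants the requested number of executors, but any executor placed on a node the application has blacklisted is unusable, so the app runs below its requested parallelism.

```scala
case class Executor(id: Int, node: String)

// Executors the app can actually schedule tasks on.
def usableExecutors(granted: Seq[Executor], nodeBlacklist: Set[String]): Seq[Executor] =
  granted.filterNot(e => nodeBlacklist.contains(e.node))

// 10 executors granted, but 2 land on a blacklisted node => only 8 usable.
val granted = (1 to 8).map(i => Executor(i, s"good-$i")) ++
  Seq(Executor(9, "bad-node"), Executor(10, "bad-node"))
println(usableExecutors(granted, Set("bad-node")).size) // prints 8
```

If the backend passed the blacklist along when requesting executors (as the yarn backend does), the allocator could avoid handing out those 2 unusable slots in the first place.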
I don't think this can be handled directly with taints, if I understand 
correctly.  I assume applying a taint is an admin-level operation?  That would 
mean a spark app couldn't dynamically apply a taint when it discovers a problem 
on a node (and really, it probably shouldn't be able to, since the cluster 
shouldn't trust an arbitrary user).  Furthermore, taints don't allow the 
blacklist to be application-specific -- blacklisting is really just a 
heuristic, and you probably do not want it applied across applications.  It's 
not clear what you'd do with multiple apps, each with its own blacklist, as 
nodes enter and leave each app's blacklist at different times.
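To sketch why a single cluster-wide taint can't stand in for per-app blacklists (types and names below are hypothetical, not Spark's): each application tracks its own blacklist entries with their own expiry times, so at any given instant two apps can disagree about which nodes are usable.

```scala
case class Entry(node: String, expiresAtMs: Long)

// The nodes an app currently considers bad: entries that haven't expired yet.
def activeBlacklist(entries: Seq[Entry], nowMs: Long): Set[String] =
  entries.filter(_.expiresAtMs > nowMs).map(_.node).toSet

val now = 1000L
val appA = Seq(Entry("node-1", 2000L), Entry("node-2", 500L)) // node-2 already expired for A
val appB = Seq(Entry("node-3", 3000L))

// At the same instant, the two apps have disjoint blacklists, so no single
// cluster-level marking of "bad" nodes can represent both.
println(activeBlacklist(appA, now)) // prints Set(node-1)
println(activeBlacklist(appB, now)) // prints Set(node-3)
```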

> Kubernetes should support node blacklist
> ----------------------------------------
>                 Key: SPARK-23485
>                 URL: https://issues.apache.org/jira/browse/SPARK-23485
>             Project: Spark
>          Issue Type: New Feature
>          Components: Kubernetes, Scheduler
>    Affects Versions: 2.3.0
>            Reporter: Imran Rashid
>            Priority: Major
> Spark's BlacklistTracker maintains a list of "bad nodes" which it will not 
> use for running tasks (eg., because of bad hardware).  When running in yarn, 
> this blacklist is used to avoid ever allocating resources on blacklisted 
> nodes: 
> https://github.com/apache/spark/blob/e836c27ce011ca9aef822bef6320b4a7059ec343/resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala#L128
> I'm just beginning to poke around the kubernetes code, so apologies if this 
> is incorrect -- but I didn't see any references to 
> {{scheduler.nodeBlacklist()}} in {{KubernetesClusterSchedulerBackend}} so it 
> seems this is missing.  Thought of this while looking at SPARK-19755, a 
> similar issue on mesos.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
