Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19039
  
    The changes you made in `BlacklistTracker` seem to break the design purpose 
of the blacklist. The blacklist in Spark, as in MR/Tez, assumes that bad 
nodes/executors will return to normal within several hours, so it always has a 
timeout.
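    For reference, that timeout is configurable. A minimal sketch, assuming Spark 2.2+'s blacklisting configs (the class name and jar are placeholders):
    
    ```shell
    # Enable blacklisting with an explicit timeout, so blacklisted
    # nodes/executors rejoin the pool after one hour.
    spark-submit \
      --conf spark.blacklist.enabled=true \
      --conf spark.blacklist.timeout=1h \
      --class com.example.MyApp app.jar
    ```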
    
    In your case, the problem is not bad nodes/executors; it is that you don't 
want to start executors on certain nodes (like slow nodes). This is more of a 
cluster manager problem than a Spark problem. To summarize: you want your 
Spark application to run only on specific nodes.
    
    To solve your problem, on YARN you could use node labels, and Spark on YARN 
already supports node labels. You can search for YARN node labels to learn the 
details.
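    A minimal sketch of pinning a Spark application to labeled YARN nodes, assuming the cluster admin has already defined a node label ("fast" is a hypothetical label name):
    
    ```shell
    # Request that both the AM and the executors run only on nodes
    # carrying the "fast" label.
    spark-submit \
      --master yarn \
      --conf spark.yarn.am.nodeLabelExpression=fast \
      --conf spark.yarn.executor.nodeLabelExpression=fast \
      --class com.example.MyApp app.jar
    ```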
    
    For standalone mode, simply do not start workers on the nodes you don't 
want to use.
    
    For Mesos I'm not sure, but I guess it has similar approaches.
    


