Hi all,

Spark currently has blacklisting enabled on Mesos, no matter what:
[SPARK-19755][Mesos] Blacklist is always active for
MesosCoarseGrainedSchedulerBackend

Blacklisting also prevents new drivers from running on our nodes where
previous drivers had failed tasks.

We've tried restarting the Spark dispatcher before submitting new tasks. Even
creating new machines (with the same hostname) does not help.

Looking at TaskSetBlacklist
<https://github.com/apache/spark/blob/e18d6f5326e0d9ea03d31de5ce04cb84d3b8ab37/core/src/main/scala/org/apache/spark/scheduler/TaskSetBlacklist.scala#L66>,
I don't understand how a fresh Spark job submitted from a fresh Spark
Dispatcher can report all the nodes as blacklisted right away. How does
Spark know about previous task failures?

This issue is severely disrupting our work. How can we disable blacklisting on
Spark 2.3.0? Creative ideas are welcome :)

Best,
Han



