GitHub user squito commented on the issue:
https://github.com/apache/spark/pull/19338
@caneGuy thanks for working on this; it looks very reasonable to me. I am going to take a closer look at a couple of details, but in the meantime can you make a couple of updates:
1) Can you open a new jira for this, and put that in the commit summary? SPARK-21539 is referring to something else entirely.
2) Can you reformat the new exception to look a bit more like the formatting used when there are too many failures of a specific task? Maybe like this:
```
User class threw exception: org.apache.spark.SparkException: Job aborted
due to stage failure: Aborting TaskSet 0.0 because task 0 (partition 0) cannot
run anywhere due to node and executor blacklist. Most recent failure:
Lost task 0.1 in stage 0.0 (TID 3,xxx, executor 1): java.lang.Exception:
Fake error!
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:73)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:305)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Blacklisting behavior can be configured via spark.blacklist.*.
Driver Stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1458)
...
```
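For reference, a message in that shape could be assembled roughly like this. This is a minimal Java sketch only; the class and method names are hypothetical and this is not the actual TaskSetManager code:

```java
// Hypothetical sketch of composing the suggested abort message.
// All names here are illustrative, not actual Spark scheduler code.
public class BlacklistMessageSketch {
    static String abortMessage(int stageId, int stageAttempt, int taskIndex,
                               int partition, String mostRecentFailure) {
        return "Aborting TaskSet " + stageId + "." + stageAttempt
            + " because task " + taskIndex + " (partition " + partition + ")"
            + " cannot run anywhere due to node and executor blacklist."
            + " Most recent failure:\n" + mostRecentFailure + "\n"
            + "Blacklisting behavior can be configured via spark.blacklist.*.";
    }

    public static void main(String[] args) {
        // Example with placeholder failure text, mirroring the log excerpt above.
        System.out.println(abortMessage(0, 0, 0, 0,
            "Lost task 0.1 in stage 0.0: java.lang.Exception: Fake error!"));
    }
}
```

The point is just that the abort reason leads with the TaskSet id and the unschedulable task, then appends the most recent failure and the `spark.blacklist.*` pointer, matching the style of the existing too-many-failures message.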