GitHub user attilapiros opened a pull request:
https://github.com/apache/spark/pull/20203
[SPARK-22577] [core] executor page blacklist status should update with
TaskSet level blacklisting
## What changes were proposed in this pull request?
In this PR, stage-level blacklisting is propagated to the UI by introducing a new
Spark listener event (SparkListenerExecutorBlacklistedForStage) which indicates
the executor is blacklisted for a stage (see the existing configuration:
spark.blacklist.stage.maxFailedTasksPerExecutor for details). Blacklisting
state is propagated to the "Aggregated Metrics by Executor" table's
blacklisting column (for a selected stage).
After this change, the column shows one of three labels:
- "for application": when the executor is blacklisted for the application
(see the configuration spark.blacklist.application.maxFailedTasksPerExecutor
for details)
- "for stage": when the executor is **only** blacklisted for the stage
- "false" : when the executor is not blacklisted at all
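The label rules above can be sketched as a small standalone model. This is only an illustration: `ExecutorBlacklistedForStage` is a simplified stand-in for the new listener event (its fields here are an assumption), and `blacklistedByStage` and `label` are hypothetical helpers, not code from this patch.

```scala
// Simplified stand-in for the listener event named in this PR; the field
// list is an assumption for illustration, not the real Spark class.
final case class ExecutorBlacklistedForStage(executorId: String, stageId: Int)

object BlacklistLabel {
  // Hypothetical helper: fold events into stageId -> blacklisted executor ids.
  def blacklistedByStage(
      events: Seq[ExecutorBlacklistedForStage]): Map[Int, Set[String]] =
    events.groupBy(_.stageId).map { case (stageId, evs) =>
      stageId -> evs.map(_.executorId).toSet
    }

  // Hypothetical helper: choose the label for one executor on one stage page.
  // Application-level blacklisting takes precedence, so "for stage" is shown
  // only when the executor is blacklisted for the stage alone.
  def label(
      executorId: String,
      stageId: Int,
      appBlacklisted: Set[String],
      stageBlacklisted: Map[Int, Set[String]]): String =
    if (appBlacklisted.contains(executorId)) "for application"
    else if (stageBlacklisted.getOrElse(stageId, Set.empty[String]).contains(executorId)) "for stage"
    else "false"
}
```

In this sketch an executor blacklisted at both levels is reported as "for application", which matches the "**only**" qualifier in the list above.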
## How was this patch tested?
It was tested both manually and with unit tests (including an API test via
HistoryServerSuite).
For the manual test, Spark was started on a local cluster with:
```
$ bin/spark-shell --master "local-cluster[2,1,1024]" \
    --conf "spark.blacklist.enabled=true" \
    --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" \
    --conf "spark.blacklist.application.maxFailedTasksPerExecutor=10" \
    --conf "spark.eventLog.enabled=true"
```
Then the following was executed in the shell:
``` scala
import org.apache.spark.SparkEnv
sc.parallelize(1 to 10, 10).map { x =>
  // Fail every task that runs on executor "0" so that it gets blacklisted for the stage
  if (SparkEnv.get.executorId == "0") throw new RuntimeException("Bad executor")
  else (x % 3, x)
}.reduceByKey((a, b) => a + b).collect()
```
To see the result, check the "Aggregated Metrics by Executor" section at the
bottom of the screenshot:
[UI screenshot for stage level blacklisting](https://issues.apache.org/jira/secure/attachment/12905283/stage_blacklisting.png)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/attilapiros/spark SPARK-22577
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20203.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20203
----
commit 8d736c1cd56e341d4d7da88bae01ac3a47649f80
Author: “attilapiros” <piros.attila.zsolt@...>
Date: 2018-01-05T20:45:54Z
Propagate stage blacklisting to UI.
----