GitHub user attilapiros opened a pull request:

    https://github.com/apache/spark/pull/20203

    [SPARK-22577] [core] executor page blacklist status should update with TaskSet level blacklisting

    ## What changes were proposed in this pull request?
    
    This PR propagates stage-level blacklisting to the UI by introducing a new
    Spark listener event, SparkListenerExecutorBlacklistedForStage, which
    indicates that an executor is blacklisted for a stage (see the existing
    configuration spark.blacklist.stage.maxFailedTasksPerExecutor for details).
    The blacklisting state is shown in the blacklisting column of the
    "Aggregated Metrics by Executor" table for the selected stage.
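    
    As a rough illustration of how a UI-side consumer might track this event,
    here is a minimal, Spark-free sketch. The `ExecutorBlacklistedForStage`
    case class and the `StageBlacklistTracker` class are hypothetical stand-ins
    whose field and method names are assumptions, not the definitions in this
    patch.
    
    ``` scala
    import scala.collection.mutable

    // Hypothetical stand-in for the new listener event; the field names are
    // assumptions for illustration, not the PR's actual definition.
    case class ExecutorBlacklistedForStage(executorId: String, stageId: Int, stageAttemptId: Int)

    // Hypothetical tracker: remembers, per (stageId, stageAttemptId), which
    // executors have been reported as blacklisted for that stage attempt.
    class StageBlacklistTracker {
      private val blacklisted = mutable.Map.empty[(Int, Int), Set[String]]

      // Record the executor named in the event under its stage attempt.
      def onExecutorBlacklistedForStage(event: ExecutorBlacklistedForStage): Unit = {
        val key = (event.stageId, event.stageAttemptId)
        blacklisted(key) = blacklisted.getOrElse(key, Set.empty[String]) + event.executorId
      }

      // True only if an event was recorded for this executor and stage attempt.
      def isBlacklistedForStage(executorId: String, stageId: Int, stageAttemptId: Int): Boolean =
        blacklisted.getOrElse((stageId, stageAttemptId), Set.empty[String]).contains(executorId)
    }
    ```
    
    Keying on the stage attempt keeps the state local to the selected stage,
    which matches the per-stage scope of the table.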
    
    After this change, the column shows one of three labels:
    - "for application": the executor is blacklisted for the application
      (see the configuration spark.blacklist.application.maxFailedTasksPerExecutor
      for details)
    - "for stage": the executor is **only** blacklisted for the stage
    - "false": the executor is not blacklisted at all
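    
    The precedence between the labels can be sketched as below. The
    `blacklistLabel` helper is hypothetical and is not taken from the patch; it
    only assumes that application-level blacklisting wins over stage-level
    blacklisting, as the word "only" in the list implies.
    
    ``` scala
    // Hypothetical helper, not the PR's actual code: maps the two blacklisting
    // flags to the label shown in the "Aggregated Metrics by Executor" table.
    def blacklistLabel(blacklistedForApp: Boolean, blacklistedForStage: Boolean): String =
      if (blacklistedForApp) "for application"   // application-level takes precedence
      else if (blacklistedForStage) "for stage"  // blacklisted for this stage only
      else "false"                               // not blacklisted at all
    ```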
    
    ## How was this patch tested?
    
    It was tested both manually and with unit tests (including an API test via
    HistoryServerSuite).
    
    For manual testing, Spark was run on a local cluster as follows:
    ```
    $ bin/spark-shell --master "local-cluster[2,1,1024]" \
        --conf "spark.blacklist.enabled=true" \
        --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" \
        --conf "spark.blacklist.application.maxFailedTasksPerExecutor=10" \
        --conf "spark.eventLog.enabled=true"
    ```
    
    Executing:
    ``` scala
    import org.apache.spark.SparkEnv
    
    // Fail every task that runs on executor "0" so that it is blacklisted for the stage.
    sc.parallelize(1 to 10, 10).map { x =>
      if (SparkEnv.get.executorId == "0") throw new RuntimeException("Bad executor")
      else (x % 3, x)
    }.reduceByKey((a, b) => a + b).collect()
    ```
    
    To see the result, check the "Aggregated Metrics by Executor" section at the
    bottom of the screenshot:
    [UI screenshot for stage level blacklisting](https://issues.apache.org/jira/secure/attachment/12905283/stage_blacklisting.png)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/attilapiros/spark SPARK-22577

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20203.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20203
    
----
commit 8d736c1cd56e341d4d7da88bae01ac3a47649f80
Author: “attilapiros” <piros.attila.zsolt@...>
Date:   2018-01-05T20:45:54Z

    Propagate stage blacklisting to UI.

----

