[ https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Graves updated SPARK-24415:
----------------------------------
Attachment: Screen Shot 2018-05-29 at 2.15.38 PM.png
> Stage page aggregated executor metrics wrong when failures
> -----------------------------------------------------------
>
> Key: SPARK-24415
> URL: https://issues.apache.org/jira/browse/SPARK-24415
> Project: Spark
> Issue Type: Bug
> Components: Web UI
> Affects Versions: 2.3.0
> Reporter: Thomas Graves
> Priority: Major
> Attachments: Screen Shot 2018-05-29 at 2.15.38 PM.png
>
>
> When running Spark 2.3 on YARN with task failures and blacklisting enabled,
> the aggregated metrics by executor on the stage page are not correct. In my
> example it should show 2 failed tasks, but it shows only one.
> I will attach a screenshot.
> To reproduce:
> $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client \
>   --executor-memory=2G --num-executors=1 \
>   --conf "spark.blacklist.enabled=true" \
>   --conf "spark.blacklist.stage.maxFailedTasksPerExecutor=1" \
>   --conf "spark.blacklist.stage.maxFailedExecutorsPerNode=1" \
>   --conf "spark.blacklist.application.maxFailedTasksPerExecutor=2" \
>   --conf "spark.blacklist.killBlacklistedExecutors=true"
>
> import org.apache.spark.SparkEnv
>
> // Fail every task that runs on executors 1 through 4; tasks on other executors succeed.
> sc.parallelize(1 to 10000, 10).map { x =>
>   val id = SparkEnv.get.executorId.toInt
>   if (id >= 1 && id <= 4) throw new RuntimeException("Bad executor")
>   else (x % 3, x)
> }.reduceByKey((a, b) => a + b).collect()
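
As a cross-check on the counts shown in the UI, failed tasks can be tallied per executor with a listener registered in the same spark-shell session before the job above is run. This is only a minimal sketch and is not part of the original report: the name `failedByExecutor` is made up for illustration, and `sc` is assumed to be the shell's SparkContext.

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}
import scala.collection.mutable

// Hypothetical helper map: executor id -> number of failed task attempts seen so far.
val failedByExecutor = mutable.Map.empty[String, Int].withDefaultValue(0)

sc.addSparkListener(new SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // TaskInfo.failed is set for attempts that ended in the FAILED state.
    if (taskEnd.taskInfo.failed) failedByExecutor.synchronized {
      failedByExecutor(taskEnd.taskInfo.executorId) += 1
    }
  }
})
```

Once the job has finished or failed, `failedByExecutor.toMap` can be compared with the failed-task column of the aggregated metrics by executor. Listener events are delivered asynchronously, so the map may lag briefly behind the job.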