[ https://issues.apache.org/jira/browse/SPARK-24182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-24182:
------------------------------------
Assignee: Apache Spark
> Improve error message for client mode when AM fails
> ---------------------------------------------------
>
> Key: SPARK-24182
> URL: https://issues.apache.org/jira/browse/SPARK-24182
> Project: Spark
> Issue Type: Improvement
> Components: YARN
> Affects Versions: 2.3.0
> Reporter: Marcelo Vanzin
> Assignee: Apache Spark
> Priority: Minor
>
> Today, when the AM fails in client mode, very little useful information is
> printed to the output. Depending on the type of failure, the information
> provided by the YARN AM is not very useful either. For example, you'd see
> this in the Spark shell:
> {noformat}
> 18/05/04 11:07:38 ERROR spark.SparkContext: Error initializing SparkContext.
> org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
>     at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:86)
>     at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:63)
>     at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
> [long stack trace]
> {noformat}
> Similarly, on the YARN RM, for certain failures you see a generic error like
> this:
> {noformat}
> ExitCodeException exitCode=10:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
>     at org.apache.hadoop.util.Shell.run(Shell.java:460)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:366)
>     at [blah blah blah]
> {noformat}
> It would be nice if we could provide a more accurate description of what went
> wrong when possible.