[ 
https://issues.apache.org/jira/browse/SPARK-30529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-30529:
-------------------------------------

    Assignee: Thomas Graves

> Improve error messages when Executor dies before registering with driver
> ------------------------------------------------------------------------
>
>                 Key: SPARK-30529
>                 URL: https://issues.apache.org/jira/browse/SPARK-30529
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>            Priority: Major
>
> Currently, when you give the executor a bad configuration for 
> accelerator-aware scheduling, the executor can die, but it's hard for the 
> user to know why. The executor logs what went wrong in its log files, but 
> those logs are often hard to find because the executor hasn't registered 
> yet. Since it hasn't registered, the executor doesn't show up in the UI, 
> so there is no link to its log files.
> One specific example is a discovery script that doesn't find all the 
> GPUs:
> {code}
> 20/01/16 08:59:24 INFO YarnCoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@10.28.9.112:44403
> 20/01/16 08:59:24 ERROR Inbox: Ignoring error
> java.lang.IllegalArgumentException: requirement failed: Resource: gpu, with addresses: 0 is less than what the user requested: 2)
>  at scala.Predef$.require(Predef.scala:281)
>  at org.apache.spark.resource.ResourceUtils$.$anonfun$assertAllResourceAllocationsMatchResourceProfile$1(ResourceUtils.scala:251)
>  at org.apache.spark.resource.ResourceUtils$.$anonfun$assertAllResourceAllocationsMatchResourceProfile$1$adapted(ResourceUtils.scala:248)
> {code}
>  
> Figure out a better way of logging, or otherwise letting the user know, 
> what error occurred when the executor dies before registering.
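
For illustration only, the failure in the log above can be reproduced outside Spark with a minimal sketch. It assumes the discovery script prints a JSON object of the form {"name": "gpu", "addresses": ["0", "1"]}, which is the format Spark expects from a resource discovery script. The function name check_discovered is hypothetical and is not part of Spark; the error text is modeled on the log message, not copied from the Spark source.

```python
import json
from typing import List


def check_discovered(script_output: str, resource: str, requested: int) -> List[str]:
    """Hypothetical helper: parse discovery-script output and verify the count.

    Mirrors the spirit of the requirement that failed in the log above; it is
    not Spark's actual implementation.
    """
    info = json.loads(script_output)
    if info.get("name") != resource:
        raise ValueError(
            f"expected resource {resource!r}, got {info.get('name')!r}"
        )
    addresses = info.get("addresses", [])
    if len(addresses) < requested:
        # Fail loudly with the resource name, what was found, and what was asked for.
        raise ValueError(
            f"Resource: {resource}, with addresses: {','.join(addresses)} "
            f"is less than what the user requested: {requested}"
        )
    return addresses
```

Running a check like this against the script's output before submitting the job would surface the mismatch on the client side, where the user can see it, rather than only in the log of an executor that never registered.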



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
