[ https://issues.apache.org/jira/browse/SPARK-30529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun reassigned SPARK-30529:
-------------------------------------

    Assignee: Thomas Graves

> Improve error messages when Executor dies before registering with driver
> ------------------------------------------------------------------------
>
>                 Key: SPARK-30529
>                 URL: https://issues.apache.org/jira/browse/SPARK-30529
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>            Priority: Major
>
> Currently, when you give the executor a bad configuration for
> accelerator-aware scheduling, the executors can die, but it's hard for the
> user to know why. The executor dies and logs what went wrong in its log
> files, but it is often hard to find those logs because the executor hasn't
> registered yet. Since it hasn't registered, the executor doesn't show up in
> the UI, so its log files can't be reached from there.
> One specific example is a discovery script that doesn't find all the GPUs:
> {code}
> 20/01/16 08:59:24 INFO YarnCoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@10.28.9.112:44403
> 20/01/16 08:59:24 ERROR Inbox: Ignoring error
> java.lang.IllegalArgumentException: requirement failed: Resource: gpu, with addresses: 0 is less than what the user requested: 2)
>         at scala.Predef$.require(Predef.scala:281)
>         at org.apache.spark.resource.ResourceUtils$.$anonfun$assertAllResourceAllocationsMatchResourceProfile$1(ResourceUtils.scala:251)
>         at org.apache.spark.resource.ResourceUtils$.$anonfun$assertAllResourceAllocationsMatchResourceProfile$1$adapted(ResourceUtils.scala:248)
> {code}
>
> Figure out a better way of logging, or of letting the user know what error
> occurred, when the executor dies before registering.
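
To make the discovery-script example concrete, here is a minimal Python sketch of the kind of count check that fails in the log above. It assumes the discovery script prints JSON of the form {"name": "gpu", "addresses": ["0", "1"]}. The function name check_discovered_resource and the wording of its error message are hypothetical; this is not Spark's implementation, which the stack trace places in ResourceUtils.scala.

```python
import json


def check_discovered_resource(script_output, requested):
    """Parse discovery-script JSON and verify that enough addresses were found.

    Hypothetical helper for illustration only; not part of Spark.
    """
    info = json.loads(script_output)
    name = info["name"]
    addresses = info.get("addresses", [])
    if len(addresses) < requested:
        # State both the found and the requested counts so the cause is
        # obvious to whoever eventually reads the message.
        raise ValueError(
            "Resource %s: discovery script found %d address(es) %s, "
            "but %d were requested" % (name, len(addresses), addresses, requested)
        )
    return addresses


if __name__ == "__main__":
    # A script that reports only one GPU while two are requested.
    try:
        check_discovered_resource('{"name": "gpu", "addresses": ["0"]}', 2)
    except ValueError as err:
        print(err)
```

A check like this only helps if its message reaches the user, which is the point of the issue: when the executor dies before registering, the message stays in a log file that is not linked from the UI.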