AngersZhuuuu commented on PR #40314:
URL: https://github.com/apache/spark/pull/40314#issuecomment-1459223471
> Hi, @AngersZhuuuu .
> This PR seems to have insufficient information. Could you provide more details about how to validate this in what environment?
We run a client-mode SparkSubmit job and it throws the exception below:
```
23/03/07 18:34:50 INFO YarnClientSchedulerBackend: Shutting down all executors
23/03/07 18:34:50 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
23/03/07 18:34:50 INFO YarnClientSchedulerBackend: YARN client scheduler backend Stopped
23/03/07 18:34:50 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/03/07 18:34:50 INFO BlockManager: BlockManager stopped
23/03/07 18:34:50 INFO BlockManagerMaster: BlockManagerMaster stopped
23/03/07 18:34:50 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/03/07 18:34:50 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.sql.AnalysisException: Table or view not found: xxx.xxx; line 1 pos 14;
'GlobalLimit 1
+- 'LocalLimit 1
   +- 'Project [*]
      +- 'UnresolvedRelation [xxx, xxx], [], false
    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:115)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:95)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:184)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:183)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:183)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:183)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:183)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:183)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:183)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:183)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:183)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:183)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:95)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:92)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:155)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:178)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:228)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:175)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:73)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:143)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:778)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:143)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:73)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:71)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:63)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:98)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:778)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:621)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:778)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:616)
    at org.apache.spark.sql.auth.QueryAuthChecker$.main(QueryAuthChecker.scala:33)
    at org.apache.spark.sql.auth.QueryAuthChecker.main(QueryAuthChecker.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
23/03/07 18:34:50 INFO ShutdownHookManager: Shutdown hook called
23/03/07 18:34:50 INFO ShutdownHookManager: Deleting directory /tmp/spark-8ce833e1-3cd4-4a9f-960d-695be85b12f4
23/03/07 18:34:50 INFO ShutdownHookManager: Deleting directory /hadoop/spark/sparklocaldir/spark-58bbd530-6144-4ad6-b62b-a690baac9f96
23/03/07 18:34:50 INFO SparkExecutionPlanProcessor: Lineage thread pool prepares to shut down
23/03/07 18:34:50 INFO SparkExecutionPlanProcessor: Lineage thread pool finishes to await termination and shuts down
```
The job failed, but because `sparkContext.stop()` is still called, the client side fails while the AM reports SUCCESS.
In Spark 3.1.2 the code looks like this:
```
try {
  app.start(childArgs.toArray, sparkConf)
} catch {
  case t: Throwable =>
    throw findCause(t)
} finally {
  if (!isShell(args.primaryResource) && !isSqlShell(args.mainClass) &&
      !isThriftServer(args.mainClass)) {
    try {
      SparkContext.getActive.foreach(_.stop())
    } catch {
      case e: Throwable => logError(s"Failed to close SparkContext: $e")
    }
  }
}
```
So here, for a normal job, I think we should pass the exit code to the SchedulerBackend, right?
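To illustrate the idea, here is a minimal sketch of the problem and the proposed fix. All names here (`FakeSchedulerBackend`, `FinalStatus`) are hypothetical stand-ins, not Spark's real API: the point is only that a no-argument `stop()` lets the backend default to a SUCCESS final status even when the application failed, while passing the exit code through lets it report FAILED.

```scala
object ExitCodeSketch {
  sealed trait FinalStatus
  case object Succeeded extends FinalStatus
  case object Failed extends FinalStatus

  // Stand-in for a YARN/K8s scheduler backend that reports a final
  // application status to the cluster manager when it is stopped.
  class FakeSchedulerBackend {
    var reported: FinalStatus = Succeeded

    // Today's shape: stop() without an exit code defaults to SUCCESS.
    def stop(): Unit = stop(0)

    // Proposed shape: the caller passes the application's exit code,
    // so a failed client-mode job is reported as FAILED.
    def stop(exitCode: Int): Unit =
      reported = if (exitCode == 0) Succeeded else Failed
  }

  def main(args: Array[String]): Unit = {
    val buggy = new FakeSchedulerBackend
    buggy.stop()                 // app failed, but the AM still sees SUCCESS
    println(buggy.reported)      // Succeeded

    val fixed = new FakeSchedulerBackend
    fixed.stop(exitCode = 1)     // propagate the client-side failure
    println(fixed.reported)      // Failed
  }
}
```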
Then, after your mention, I see that https://github.com/apache/spark/pull/33403 changed the behavior so that only K8s calls `sc.stop()`; so I think for both K8s and YARN mode we need to pass the exit code to the backend. After this PR, we also need to check that in client mode the K8s backend's exit code matches the client side's too.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]