mingkangli-db commented on code in PR #48551:
URL: https://github.com/apache/spark/pull/48551#discussion_r1814234935


##########
core/src/main/scala/org/apache/spark/util/SparkUncaughtExceptionHandler.scala:
##########
@@ -50,19 +63,31 @@ private[spark] class SparkUncaughtExceptionHandler(val exitOnUncaughtException:
       // We may have been called from a shutdown hook. If so, we must not call System.exit().
       // (If we do, we will deadlock.)
       if (!ShutdownHookManager.inShutdown()) {
-        exception match {
-          case _: OutOfMemoryError =>
-            System.exit(SparkExitCode.OOM)
-          case e: SparkFatalException if e.throwable.isInstanceOf[OutOfMemoryError] =>
-            // SPARK-24294: This is defensive code, in case that SparkFatalException is
-            // misused and uncaught.
-            System.exit(SparkExitCode.OOM)
-          case _: KilledByTaskReaperException if exitOnUncaughtException =>
-            System.exit(ExecutorExitCode.KILLED_BY_TASK_REAPER)
-          case _ if exitOnUncaughtException =>
-            System.exit(SparkExitCode.UNCAUGHT_EXCEPTION)
-          case _ =>
-            // SPARK-30310: Don't System.exit() when exitOnUncaughtException is false
+        // Traverse the causes up to killOnFatalErrorDepth layers

Review Comment:
   ~~Oops, I think we should address this separately in a follow-up ticket because I just realized another issue with uncaught exception handling. In `Executor.scala`, the class `KilledByTaskReaperException(message: String) extends SparkException(message)`, added in [SPARK-48541](https://github.com/apache/spark/pull/46883), is not considered a fatal error when checked through `Executor.isFatalError()`:~~
   
   ~~For example, running the following code returns `false`:~~
   
   ```scala
   // Create an instance of the KilledByTaskReaperException
   val exception = new KilledByTaskReaperException("Task was killed by the task reaper")
   // Result is false!
   val result = Executor.isFatalError(exception, depthToCheck = 5)
   ```
   
   ~~However, in `SparkUncaughtExceptionHandler`, it is assumed to be a fatal error (one of the cases in the handler):~~
   
   ```scala
   case _: KilledByTaskReaperException if exitOnUncaughtException =>
       System.exit(ExecutorExitCode.KILLED_BY_TASK_REAPER)
   ```
   
   ~~Additionally, we seem to assume it should be a fatal error here in `Executor.scala` as well, based on the comments: [code link](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L1043-L1048). It seems we expect the exception to bubble up to the `uncaughtExceptionHandler` here: [code link](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L861-L862), but it won't, because `Executor.isFatalError()` returns `false`!~~
   
   ~~This inconsistency arises because `Executor.isFatalError` doesn't treat `KilledByTaskReaperException` as fatal (when it should!), even though both `Executor.scala` and `SparkUncaughtExceptionHandler` assume that it is. To reuse the logic, we would first need to fix `Executor.isFatalError` to recognize `KilledByTaskReaperException` as a fatal error.~~
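   
   For reference, a rough sketch of what that change to `Executor.isFatalError` might look like, assuming the method recursively walks the cause chain up to `depthToCheck` levels (an illustrative sketch only, not the actual Spark implementation):
   
   ```scala
   import scala.util.control.NonFatal
   
   // Hypothetical sketch: classify an exception (and its causes, up to
   // depthToCheck levels) as fatal or not.
   def isFatalError(t: Throwable, depthToCheck: Int): Boolean = {
     if (depthToCheck <= 0) {
       false
     } else {
       t match {
         // Proposed addition: treat a task-reaper kill as fatal so it bubbles up
         // to the SparkUncaughtExceptionHandler.
         case _: KilledByTaskReaperException => true
         // Anything the Scala runtime considers fatal (OutOfMemoryError, etc.) stays fatal.
         case e if !NonFatal(e) => true
         // Otherwise, keep walking the cause chain.
         case e if e.getCause != null => isFatalError(e.getCause, depthToCheck - 1)
         case _ => false
       }
     }
   }
   ```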
   
   ~~What do you think? @cloud-fan~~
   
   Sorry, I think I misunderstood the exception handling logic. Please disregard this.
   


