JoshRosen commented on code in PR #46883:
URL: https://github.com/apache/spark/pull/46883#discussion_r1630167906


##########
core/src/main/scala/org/apache/spark/executor/Executor.scala:
##########
@@ -1310,3 +1309,5 @@ private[spark] object Executor {
     }
   }
 }
+
+class KilledByTaskReaperException(message: String) extends SparkException(message)

Review Comment:
   Before, it looks like we'd throw `.internalError(msg, category = "EXECUTOR")`, but now we're throwing a new `SparkException` with no error class set, so the resulting error class becomes `INTERNAL_ERROR` whereas before it was `INTERNAL_ERROR_EXECUTOR`.
   
   Maybe we should migrate this to the error classes framework? 
   
   If we don't migrate it to the error classes framework now, perhaps we should have this `extends SparkException(message, cause = null, errorClass = "INTERNAL_ERROR_EXECUTOR")` so that we preserve the error class?
   
   On the other hand, I'm not sure that this exception can even bubble up to user code, so in practice the exact error class might not matter: we're really just using this for control flow.
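   To make the preservation option concrete, here is a rough sketch. It assumes a `SparkException` constructor taking an `errorClass` and `messageParameters` exists in roughly this shape; the exact signature may differ in the current codebase:

   ```scala
   // Hedged sketch: keep the distinct marker type for control flow while
   // preserving the old INTERNAL_ERROR_EXECUTOR error class. The constructor
   // signature here is an assumption, not a confirmed API.
   class KilledByTaskReaperException(message: String)
     extends SparkException(
       errorClass = "INTERNAL_ERROR_EXECUTOR",
       messageParameters = Map("message" -> message),
       cause = null)
   ```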



##########
core/src/test/scala/org/apache/spark/JobCancellationSuite.scala:
##########
@@ -455,6 +464,8 @@ class JobCancellationSuite extends SparkFunSuite with Matchers with BeforeAndAft
     sc.cancelJobGroup("jobA")
     val e = intercept[SparkException] { ThreadUtils.awaitResult(jobA, 15.seconds) }.getCause
     assert(e.getMessage contains "cancel")
+    semExec.acquire(2)
+    assert(execLossReason == Seq("Command exited with code 53", "Command exited with code 53"))

Review Comment:
   We could reference the `ExecutorExitCode` constant here instead of the literal `53`; that makes the assertion's intent clearer and lets find-usages locate this test.



##########
core/src/main/scala/org/apache/spark/util/SparkExitCode.scala:
##########
@@ -45,6 +45,10 @@ private[spark] object SparkExitCode {
       OutOfMemoryError. */
   val OOM = 52
 
+  /** The default uncaught exception handler was reached and the exception was thrown by
+     TaskReaper. */
+  val KILLED_BY_TASK_REAPER = 53

Review Comment:
   I did a bit of digging to figure out whether exit code 53 might have had any special meaning in Spark or in other systems and found https://stackoverflow.com/questions/45428145/do-exit-codes-and-exit-statuses-mean-anything-in-spark, which reminded me of the existence of `ExecutorExitCode.scala`, where 53 is already used for a BlockManager-related exit condition:
   
   https://github.com/apache/spark/blob/0f21df0b29cc18f0e0c7b12543f3a037e4032e65/core/src/main/scala/org/apache/spark/executor/ExecutorExitCode.scala#L30-L34
   
   Given that the task reaper kill is an executor-specific behavior, I think we should add the new exit code in `ExecutorExitCode.scala` instead. Maybe use the next free code, 57, for that?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

