JoshRosen commented on code in PR #46883:
URL: https://github.com/apache/spark/pull/46883#discussion_r1630167906
##########
core/src/main/scala/org/apache/spark/executor/Executor.scala:
##########
@@ -1310,3 +1309,5 @@ private[spark] object Executor {
}
}
}
+
+class KilledByTaskReaperException(message: String) extends SparkException(message)
Review Comment:
Before, it looks like we'd throw `.internalError(msg, category = "EXECUTOR")`, but now we're throwing a new `SparkException` with no error class set, so the resulting error class becomes `INTERNAL_ERROR` whereas before it was `INTERNAL_ERROR_EXECUTOR`.
Maybe we should migrate this to the error classes framework?
If we don't migrate it to an error class now, perhaps we should have this be `extends SparkException(message, cause = null, errorClass = "INTERNAL_ERROR_EXECUTOR")` so that we preserve the error class?
On the other hand, I'm not sure that this exception can even bubble up to user code, so in practice the exact error class might not matter: we're really just using this for control flow.
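To illustrate the suggestion, here is a rough sketch of the error-class-preserving variant. The `SparkException` below is a minimal stand-in so the snippet is self-contained; the real class lives in Spark and its exact constructor signature (in particular whether `errorClass` is an `Option[String]`) is an assumption here, not taken from the PR:

```scala
// Minimal stand-in for org.apache.spark.SparkException, only to show the
// shape of the proposal. The (message, cause, errorClass) constructor is
// an assumption about the real class, not a verified signature.
class SparkException(
    msg: String,
    val cause: Throwable,
    val errorClass: Option[String]) extends Exception(msg, cause)

// The suggested variant: keep the dedicated exception type for control
// flow in the uncaught-exception handler, but preserve the error class
// that .internalError(msg, category = "EXECUTOR") used to produce.
class KilledByTaskReaperException(message: String)
  extends SparkException(message, cause = null, errorClass = Some("INTERNAL_ERROR_EXECUTOR"))

object Demo {
  def main(args: Array[String]): Unit = {
    val e = new KilledByTaskReaperException("task killed by TaskReaper")
    // The error class survives, so downstream reporting stays unchanged.
    assert(e.errorClass.contains("INTERNAL_ERROR_EXECUTOR"))
    assert(e.getMessage == "task killed by TaskReaper")
    println("ok")
  }
}
```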
##########
core/src/test/scala/org/apache/spark/JobCancellationSuite.scala:
##########
@@ -455,6 +464,8 @@ class JobCancellationSuite extends SparkFunSuite with Matchers with BeforeAndAft
sc.cancelJobGroup("jobA")
val e = intercept[SparkException] { ThreadUtils.awaitResult(jobA, 15.seconds) }.getCause
assert(e.getMessage contains "cancel")
+ semExec.acquire(2)
+ assert(execLossReason == Seq("Command exited with code 53", "Command exited with code 53"))
Review Comment:
We could use the `ExecutorExitCode` constant here; that would make the assertion a little clearer and let find-usages locate this use of the exit code.
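For example, the assertion could be written against a named constant instead of the literal `53`. A minimal self-contained sketch (the constant name mirrors the one proposed in this PR; where it ultimately lives is still under discussion):

```scala
// Stand-in for the proposed constant; in the real code this would come
// from ExecutorExitCode (or SparkExitCode, as currently written in the PR).
object ExecutorExitCode {
  val KILLED_BY_TASK_REAPER = 53
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Simulated executor-loss reasons, as collected by the test's listener.
    val execLossReason = Seq(
      "Command exited with code 53",
      "Command exited with code 53")
    // Interpolating the constant keeps the assertion in sync if the exit
    // code value ever changes, and makes find-usages surface this test.
    val expected = s"Command exited with code ${ExecutorExitCode.KILLED_BY_TASK_REAPER}"
    assert(execLossReason == Seq(expected, expected))
    println("ok")
  }
}
```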
##########
core/src/main/scala/org/apache/spark/util/SparkExitCode.scala:
##########
@@ -45,6 +45,10 @@ private[spark] object SparkExitCode {
OutOfMemoryError. */
val OOM = 52
+ /** The default uncaught exception handler was reached and the exception was thrown by
+ TaskReaper. */
+ val KILLED_BY_TASK_REAPER = 53
Review Comment:
I did a bit of digging to figure out whether exit code 53 might have had any special meaning in Spark or in other systems and found https://stackoverflow.com/questions/45428145/do-exit-codes-and-exit-statuses-mean-anything-in-spark, which reminded me of the existence of `ExecutorExitCode.scala`, where 53 is already used for a BlockManager-related exit condition:
https://github.com/apache/spark/blob/0f21df0b29cc18f0e0c7b12543f3a037e4032e65/core/src/main/scala/org/apache/spark/executor/ExecutorExitCode.scala#L30-L34
Given that the task reaper kill is an executor-specific behavior, I think we should add the new exit code in `ExecutorExitCode.scala` instead. Let's maybe use the next free code, 57, for that?
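Concretely, the suggested placement might look something like the sketch below. It is shown as a plain object so it compiles standalone (the real one is `private[spark]`), and the surrounding constants are abbreviated; only the `DISK_STORE_FAILED_TO_CREATE_DIR = 53` entry is taken from the linked file, the rest of the layout is illustrative:

```scala
// Sketch: put the new executor-specific exit code in ExecutorExitCode.scala
// rather than SparkExitCode.scala, using the next free value (57) so it
// cannot collide with 53, which is already taken below.
object ExecutorExitCode {
  /** DiskStore failed to create a local temporary directory after many attempts. */
  val DISK_STORE_FAILED_TO_CREATE_DIR = 53

  // ... existing executor exit codes between 54 and 56 elided ...

  /** The default uncaught exception handler was reached and the exception
    * was thrown by TaskReaper. */
  val KILLED_BY_TASK_REAPER = 57
}

object Demo {
  def main(args: Array[String]): Unit = {
    // The two codes stay distinct, which is the point of moving the constant.
    assert(ExecutorExitCode.KILLED_BY_TASK_REAPER != ExecutorExitCode.DISK_STORE_FAILED_TO_CREATE_DIR)
    assert(ExecutorExitCode.KILLED_BY_TASK_REAPER == 57)
    println("ok")
  }
}
```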
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]