GitHub user LuciferYang opened a pull request:
https://github.com/apache/spark/pull/22149
[SPARK-25158][SQL]Executor accidentally exit because
ScriptTransformationWriterThread throws TaskKilledException.
## What changes were proposed in this pull request?
When a Spark SQL job uses the transform feature (`ScriptTransformationExec`)
with `spark.speculation = true`, the driver may send a command to kill a
speculative task while `ScriptTransformationWriterThread` is running its
`foreach` code. Because the `interrupted` flag of `TaskContext` is then true,
`ScriptTransformationWriterThread` throws a `TaskKilledException`. Its catch
block assigns the `TaskKilledException` to the `_exception` var, which
`ScriptTransformationExec` and `TaskRunner` use to complete the task kill
process. This part is correct.

However, `ScriptTransformationWriterThread` also rethrows the
`TaskKilledException`. `Utils.logUncaughtExceptions` catches it, logs it, and
rethrows it again, so it reaches the `SparkUncaughtExceptionHandler` that is
registered during Executor start. That handler logs the `TaskKilledException`
and then calls `System.exit(SparkExitCode.UNCAUGHT_EXCEPTION)`, which shuts
down the Executor.
This PR adds protection against the scenario above:
- Add a case for `TaskKilledException` in the
`ScriptTransformationWriterThread` catch block that only logs the exception
and assigns it to `_exception`, without rethrowing it.
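
As a rough illustration of the failure path and of the proposed guard, here is
a minimal, Spark-free sketch. Every name in it is a hypothetical stand-in:
`TaskKilledException` is a local class rather than Spark's, `WriterThreadSketch`
only imitates the shape of `ScriptTransformationWriterThread`, and a recording
`Thread.UncaughtExceptionHandler` plays the role of
`SparkUncaughtExceptionHandler` (it sets a flag instead of calling
`System.exit`). It is not the code in this patch.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for org.apache.spark.TaskKilledException.
class TaskKilledException(val reason: String) extends RuntimeException(reason)

// Simplified writer thread. `rethrowKill = true` imitates the old behaviour,
// `rethrowKill = false` imitates the behaviour proposed in this PR.
class WriterThreadSketch(body: () => Unit, rethrowKill: Boolean)
    extends Thread("writer-thread-sketch") {

  @volatile private var _exception: Throwable = null

  // What the task side would inspect to complete the kill process.
  def exception: Option[Throwable] = Option(_exception)

  override def run(): Unit = {
    try {
      body()
    } catch {
      case e: TaskKilledException if !rethrowKill =>
        // Proposed behaviour: record the kill, do not rethrow.
        _exception = e
      case t: Throwable =>
        // Any other failure (and the kill, in the old behaviour) is
        // recorded and rethrown, so it escapes the thread.
        _exception = t
        throw t
    }
  }
}

object WriterThreadDemo {
  // Runs one writer thread whose body is "killed" immediately.
  // Returns (recorded exception, whether the uncaught handler was invoked).
  def runOnce(rethrowKill: Boolean): (Option[Throwable], Boolean) = {
    val handlerInvoked = new AtomicBoolean(false)
    val writer = new WriterThreadSketch(
      () => throw new TaskKilledException("another attempt succeeded"),
      rethrowKill)
    // Stand-in for SparkUncaughtExceptionHandler: flag instead of System.exit.
    writer.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
      override def uncaughtException(t: Thread, e: Throwable): Unit =
        handlerInvoked.set(true)
    })
    writer.start()
    writer.join()
    (writer.exception, handlerInvoked.get())
  }

  def main(args: Array[String]): Unit = {
    println(s"old behaviour: ${runOnce(rethrowKill = true)}")
    println(s"new behaviour: ${runOnce(rethrowKill = false)}")
  }
}
```

In both runs the exception is recorded for the task side to see. Only when it
is rethrown does it reach the uncaught-exception handler, which is the step
that would shut down a real Executor.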
## How was this patch tested?
In local mode, `SparkUncaughtExceptionHandler` is not registered during
Executor start. We re-ran the user job with the above change and
`spark.speculation = true`, and the problem no longer occurs.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/LuciferYang/spark fix-transformation-task-kill
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22149.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22149
----
commit 412497f2ad615e5aeecb91e7fd5053864a00be37
Author: yangjie01 <yangjie01@...>
Date: 2018-08-16T09:07:09Z
fix Executor exit cause by ScriptTransformationWriterThread throw
TaskKilledException
----