Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r56270216
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -67,56 +70,67 @@ private[deploy] class DriverRunner(
private var clock: Clock = new SystemClock()
private var sleeper = new Sleeper {
- def sleep(seconds: Int): Unit = (0 until seconds).takeWhile(f => {Thread.sleep(1000); !killed})
+ def sleep(seconds: Int): Unit = Thread.sleep(seconds * 1000)
}
/** Starts a thread to run and manage the driver. */
private[worker] def start() = {
- new Thread("DriverRunner for " + driverId) {
+ workerThread = new Thread("DriverRunner for " + driverId) {
override def run() {
try {
- val driverDir = createWorkingDirectory()
- val localJarFilename = downloadUserJar(driverDir)
-
- def substituteVariables(argument: String): String = argument match {
- case "{{WORKER_URL}}" => workerUrl
- case "{{USER_JAR}}" => localJarFilename
- case other => other
+ shutdownHook = ShutdownHookManager.addShutdownHook { () =>
+ killProcessAndFinalize(DriverState.KILLED, new SparkException("Worker shutting down"))
}
- // TODO: If we add ability to submit multiple jars they should also be added here
- val builder = CommandUtils.buildProcessBuilder(driverDesc.command, securityManager,
- driverDesc.mem, sparkHome.getAbsolutePath, substituteVariables)
- launchDriver(builder, driverDir, driverDesc.supervise)
+ // prepare driver jars, launch driver and set final state from process exit code
+ val exitCode = prepareAndLaunchDriver()
+ finalState = if (exitCode == 0) Some(DriverState.FINISHED) else Some(DriverState.FAILED)
}
catch {
- case e: Exception => finalException = Some(e)
+ case interrupted: InterruptedException =>
+ logInfo("Runner thread for driver " + driverId + " interrupted")
+ killProcessAndFinalize(DriverState.KILLED, interrupted)
+ case e: Exception =>
+ killProcessAndFinalize(DriverState.ERROR, e)
+ }
+ finally {
+ if (shutdownHook != null) ShutdownHookManager.removeShutdownHook(shutdownHook)
--- End diff --
From what I could tell, `DriverRunner.kill` is not always called, for example
when the driver completes on its own. I remove the hook here so that the
ShutdownHookManager does not keep holding a reference to it, which would
otherwise keep the DriverRunner object around after it has finished.
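
To illustrate the concern, here is a minimal, self-contained sketch of the register-then-remove-in-`finally` pattern. `HookRegistry` and `Runner` are hypothetical stand-ins written for this example, not Spark's `ShutdownHookManager` or `DriverRunner`, and the state names are plain strings rather than `DriverState` values. The point is only that the hook closure captures the runner, so the registry keeps the runner reachable until the hook is removed.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a shutdown hook registry (NOT Spark's ShutdownHookManager).
object HookRegistry {
  private val hooks = mutable.LinkedHashMap.empty[AnyRef, () => Unit]

  /** Registers a hook and returns an opaque handle used to remove it later. */
  def add(hook: () => Unit): AnyRef = synchronized {
    val ref = new Object
    hooks(ref) = hook
    ref
  }

  /** Removes the hook; returns true if it was still registered. */
  def remove(ref: AnyRef): Boolean = synchronized { hooks.remove(ref).isDefined }

  /** Number of hooks currently registered. */
  def size: Int = synchronized { hooks.size }
}

// Hypothetical runner. The hook closure assigns to a field, so it captures `this`
// and the registry holds a reference to the runner while the hook is registered.
class Runner(val id: String) {
  @volatile var finalState: Option[String] = None

  def run(work: () => Int): Unit = {
    var hook: AnyRef = null
    try {
      hook = HookRegistry.add { () => finalState = Some("KILLED") }
      val exitCode = work()
      finalState = if (exitCode == 0) Some("FINISHED") else Some("FAILED")
    } catch {
      case _: Exception => finalState = Some("ERROR")
    } finally {
      // Runs on every exit path, including normal completion where no kill happens.
      if (hook != null) HookRegistry.remove(hook)
    }
  }
}
```

With this shape, the registry is empty again after `run` returns, whether the work succeeded, failed, or threw, so nothing in the registry keeps the runner alive.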