Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/2350#discussion_r17433191
--- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala ---
@@ -417,41 +381,136 @@ trait ClientBase extends Logging {
"1>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout",
"2>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr")
- logInfo("Yarn AM launch context:")
- logInfo(s" user class: ${args.userClass}")
- logInfo(s" env: $env")
- logInfo(s" command: ${commands.mkString(" ")}")
-
// TODO: it would be nicer to just make sure there are no null commands here
val printableCommands = commands.map(s => if (s == null) "null" else s).toList
amContainer.setCommands(printableCommands)
- setupSecurityToken(amContainer)
+ logDebug("===============================================================================")
+ logDebug("Yarn AM launch context:")
+ logDebug(s" user class: ${args.userClass}")
+ logDebug(" env:")
+ launchEnv.foreach { case (k, v) => logDebug(s" $k -> $v") }
+ logDebug(" resources:")
+ localResources.foreach { case (k, v) => logDebug(s" $k -> $v")}
+ logDebug(" command:")
+ logDebug(s" ${printableCommands.mkString(" ")}")
+ logDebug("===============================================================================")
// send the acl settings into YARN to control who has access via YARN interfaces
val securityManager = new SecurityManager(sparkConf)
amContainer.setApplicationACLs(YarnSparkHadoopUtil.getApplicationAclsForYarn(securityManager))
-
+ setupSecurityToken(amContainer)
amContainer
}
+
+ /**
+ * Report the state of an application until it has exited, either successfully or
+ * due to some failure, then return the application state.
+ *
+ * @param returnOnRunning Whether to also return the application state when it is RUNNING.
+ * @param logApplicationReport Whether to log details of the application report every iteration.
+ * @return state of the application, one of FINISHED, FAILED, KILLED, and RUNNING.
+ */
+ def monitorApplication(
+ appId: ApplicationId,
+ returnOnRunning: Boolean = false,
+ logApplicationReport: Boolean = true): YarnApplicationState = {
+ val interval = sparkConf.getLong("spark.yarn.report.interval", 1000)
--- End diff --
Ah right, you didn't introduce it, so if you would rather, we can file a
separate JIRA.

Since this function is now used in multiple different scenarios, it might
actually make sense for it to take the poll interval as a parameter. You could
want a different interval for different situations.

For instance, how quickly we poll on the client side and print information
(cluster mode) vs. how quickly we recognize that the application has quit and
we want to terminate (client mode). I want the latter to happen quickly,
whereas in cluster mode I might not care as much about how often it prints
updated info to the screen. I guess it's private, so we could leave it as is
and change it if we add support for that later.
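
To make the suggestion above concrete, here is a rough sketch of a monitor loop that takes the interval from the caller instead of reading it from the conf. It is not the code in this PR: `AppState`, `MonitorSketch`, `fetchState`, `intervalMs`, and `sleep` are hypothetical names used only for illustration, and `AppState` is a stand-in for YARN's `YarnApplicationState`.

```scala
import scala.annotation.tailrec

// Hypothetical stand-in for YARN's YarnApplicationState; not the real enum.
sealed trait AppState
case object Accepted extends AppState
case object Running extends AppState
case object Finished extends AppState
case object Failed extends AppState
case object Killed extends AppState

object MonitorSketch {
  // States after which the application will not change state again.
  private val terminal: Set[AppState] = Set(Finished, Failed, Killed)

  /**
   * Poll `fetchState` until the application reaches a terminal state
   * (or RUNNING, if `returnOnRunning` is set), sleeping `intervalMs`
   * between polls. The caller picks the interval for its own scenario.
   * `sleep` is injectable only so the loop can be exercised without waiting.
   */
  @tailrec
  def monitor(
      fetchState: () => AppState,
      intervalMs: Long,
      returnOnRunning: Boolean = false,
      sleep: Long => Unit = ms => Thread.sleep(ms)): AppState = {
    val state = fetchState()
    if (terminal.contains(state) || (returnOnRunning && state == Running)) {
      state
    } else {
      sleep(intervalMs)
      monitor(fetchState, intervalMs, returnOnRunning, sleep)
    }
  }
}
```

With something like this, the client-mode caller could pass a short interval so it notices the application exiting quickly, while the cluster-mode caller that only prints reports could pass a longer one.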

My suggestion for the name would be something like
spark.yarn.client.progress.pollinterval. If we were to add separate ones in the
future, they could be something like spark.yarn.app.ready.pollinterval and
spark.yarn.app.completion.pollinterval.
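
If the key were renamed, the lookup might look roughly like the sketch below. The proposed key is only the name suggested in this comment, not an existing Spark setting. Falling back to the current spark.yarn.report.interval key is my assumption about how a rename could stay compatible. A plain `Map[String, String]` stands in for `SparkConf` so the snippet is self-contained, and `PollIntervalSketch` and `pollIntervalMs` are hypothetical names.

```scala
object PollIntervalSketch {
  // Key currently read by monitorApplication in the diff above.
  val LegacyKey = "spark.yarn.report.interval"
  // Name proposed in this comment; hypothetical, not an existing setting.
  val ProposedKey = "spark.yarn.client.progress.pollinterval"
  // Default taken from the getLong call in the diff above.
  val DefaultMs = 1000L

  /** Prefer the proposed key, then the legacy key (assumed fallback), then the default. */
  def pollIntervalMs(conf: Map[String, String]): Long =
    conf.get(ProposedKey)
      .orElse(conf.get(LegacyKey))
      .map(_.trim.toLong)
      .getOrElse(DefaultMs)
}
```

The separate ready/completion keys could follow the same pattern later, each with its own default.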