Repository: spark
Updated Branches:
  refs/heads/master 091c2c3ec -> f97e9323b

[SPARK-10739] [YARN] Add application attempt window for Spark on Yarn

Add application attempt window for Spark on Yarn to ignore old out of window failures, this is useful for long running applications to recover from failures.

Author: jerryshao <ss...@hortonworks.com>

Closes #8857 from jerryshao/SPARK-10739 and squashes the following commits:

36eabdc [jerryshao] change the doc
7f9b77d [jerryshao] Style change
1c9afd0 [jerryshao] Address the comments
caca695 [jerryshao] Add application attempt window for Spark on Yarn

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f97e9323
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f97e9323
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f97e9323

Branch: refs/heads/master
Commit: f97e9323b526b3d0b0fee0ca03f4276f37bb5750
Parents: 091c2c3
Author: jerryshao <ss...@hortonworks.com>
Authored: Mon Oct 12 18:17:28 2015 -0700
Committer: Marcelo Vanzin <van...@cloudera.com>
Committed: Mon Oct 12 18:18:19 2015 -0700

----------------------------------------------------------------------
 docs/running-on-yarn.md                             |  9 +++++++++
 .../scala/org/apache/spark/deploy/yarn/Client.scala | 14 ++++++++++++++
 2 files changed, 23 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/f97e9323/docs/running-on-yarn.md
----------------------------------------------------------------------
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index 6d77db6..677c000 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -306,6 +306,15 @@ If you need a reference to the proper location to put log files in the YARN so t
   </td>
 </tr>
 <tr>
+  <td><code>spark.yarn.am.attemptFailuresValidityInterval</code></td>
+  <td>(none)</td>
+  <td>
+  Defines the validity interval for AM failure tracking.
+  If the AM has been running for at least the defined interval, the AM failure count will be reset.
+  This feature is not enabled if not configured, and only supported in Hadoop 2.6+.
+  </td>
+</tr>
+<tr>
   <td><code>spark.yarn.submit.waitAppCompletion</code></td>
   <td><code>true</code></td>
   <td>

http://git-wip-us.apache.org/repos/asf/spark/blob/f97e9323/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
----------------------------------------------------------------------
diff --git a/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala b/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
index 1fbd18a..d25d830 100644
--- a/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
+++ b/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
@@ -208,6 +208,20 @@ private[spark] class Client(
       case None => logDebug("spark.yarn.maxAppAttempts is not set. " +
           "Cluster's default value will be used.")
     }
+
+    if (sparkConf.contains("spark.yarn.am.attemptFailuresValidityInterval")) {
+      try {
+        val interval = sparkConf.getTimeAsMs("spark.yarn.am.attemptFailuresValidityInterval")
+        val method = appContext.getClass().getMethod(
+          "setAttemptFailuresValidityInterval", classOf[Long])
+        method.invoke(appContext, interval: java.lang.Long)
+      } catch {
+        case e: NoSuchMethodException =>
+          logWarning("Ignoring spark.yarn.am.attemptFailuresValidityInterval because the version " +
+            "of YARN does not support it")
+      }
+    }
+
     val capability = Records.newRecord(classOf[Resource])
     capability.setMemory(args.amMemory + amMemoryOverhead)
     capability.setVirtualCores(args.amCores)
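
Note on the reflection pattern in the Client.scala hunk: the setter is looked up by name at runtime, presumably so that the same Spark build still works against YARN versions that lack it. The following is a minimal, self-contained sketch of that pattern only. `OldContext`, `NewContext`, `ReflectiveSetter`, and `trySetInterval` are hypothetical stand-ins invented for illustration; they are not Spark or YARN classes, and the sketch does not talk to a cluster.

```scala
// Hypothetical stand-in for a submission context from an older YARN
// release that has no validity-interval setter.
class OldContext

// Hypothetical stand-in for a newer context that does expose the setter.
class NewContext {
  var validityIntervalMs: Long = -1L
  def setAttemptFailuresValidityInterval(ms: Long): Unit = {
    validityIntervalMs = ms
  }
}

object ReflectiveSetter {
  /** Invokes the setter reflectively if the runtime class has it.
    * Returns true when the setter was found and invoked, false when
    * it is absent (the case the commit logs a warning for).
    */
  def trySetInterval(ctx: AnyRef, intervalMs: Long): Boolean =
    try {
      // classOf[Long] is the primitive long type, matching a
      // setter declared with a Long parameter.
      val method = ctx.getClass.getMethod(
        "setAttemptFailuresValidityInterval", classOf[Long])
      // Method.invoke takes boxed arguments, hence the ascription.
      method.invoke(ctx, intervalMs: java.lang.Long)
      true
    } catch {
      case _: NoSuchMethodException => false
    }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val newer = new NewContext
    println(ReflectiveSetter.trySetInterval(newer, 3600000L))
    println(newer.validityIntervalMs)
    println(ReflectiveSetter.trySetInterval(new OldContext, 3600000L))
  }
}
```

Only `NoSuchMethodException` is caught here, as in the commit, so any other reflective failure would still propagate to the caller.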