tgravescs commented on a change in pull request #34491:
URL: https://github.com/apache/spark/pull/34491#discussion_r749696609
##########
File path:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/config.scala
##########
@@ -447,6 +447,13 @@ package object config extends Logging {
.stringConf
.createWithDefault("yarn.io/fpga")
+ private[spark] val APPLICATION_LIFETIME_TIMEOUT =
ConfigBuilder("spark.yarn.app.lifetime.timeout")
+ .doc("Overall timeout for a spark application running on yarn from
submitted time, " +
Review comment:
Spark should be capitalized, YARN should be capitalized.
This should explicitly describe the LIFETIME timeout. It doesn't look like
there has been any work on adding other timeout values so I think hardcoding it
to lifetime is sufficient for now.
Change to "from the time it was submitted. By default it's unlimited"
Also add text about which versions of YARN support this option
##########
File path:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/config.scala
##########
@@ -447,6 +447,13 @@ package object config extends Logging {
.stringConf
.createWithDefault("yarn.io/fpga")
+ private[spark] val APPLICATION_LIFETIME_TIMEOUT =
ConfigBuilder("spark.yarn.app.lifetime.timeout")
Review comment:
we are trying to not extend configs with more components (ie .xx.yy.) I
think spark.yarn.app.lifetimeTimeout should be sufficient.
Also you need to update docs/running-on-yarn.md
##########
File path:
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala
##########
@@ -213,6 +214,10 @@ class ClientSuite extends SparkFunSuite with Matchers {
}
appContext.getMaxAppAttempts should be (42)
appContext.getPriority.getPriority should be (1)
+ if (isApplicationTimeoutAvailable) {
+ val maybeTimeouts =
ResourceRequestTestHelper.getApplicationTimeouts(appContext)
+ assert(maybeTimeouts.get === 12345L)
+ }
}
Review comment:
it would be nice to have an actual test that times out from this. Can we
add one to YarnClusterSuite?
##########
File path:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ResourceRequestHelper.scala
##########
@@ -41,6 +42,9 @@ private object ResourceRequestHelper extends Logging {
private val AMOUNT_AND_UNIT_REGEX = "([0-9]+)([A-Za-z]*)".r
private val RESOURCE_INFO_CLASS =
"org.apache.hadoop.yarn.api.records.ResourceInformation"
private val RESOURCE_NOT_FOUND =
"org.apache.hadoop.yarn.exceptions.ResourceNotFoundException"
+ private val APPLICATION_TIMEOUT_CLASS =
"org.apache.hadoop.yarn.api.records.ApplicationTimeout"
Review comment:
Resourcerequest helper doesn't seem like the right place for any of
these changes since its not dealing with "resources".
I would rather just see this in Client class since its only used there
##########
File path:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ResourceRequestHelper.scala
##########
@@ -248,4 +252,30 @@ private object ResourceRequestHelper extends Logging {
def isYarnResourceTypesAvailable(): Boolean = {
Try(Utils.classForName(RESOURCE_INFO_CLASS)).isSuccess
}
+
+ /**
+ * YARN support application timeout feature since 2.9.0/3.0.0
+ * https://issues.apache.org/jira/browse/YARN-3813
+ */
+ def isApplicationTimeoutAvailable: Boolean = {
+ Try {
+ Utils.classForName(APPLICATION_TIMEOUT_CLASS)
+ Utils.classForName(APPLICATION_TIMEOUT_TYPE_CLASS)
+ }.isSuccess
+ }
+
+ def setApplicationTimeouts(appContext: ApplicationSubmissionContext,
timeout: Long): Unit = {
+ if (isApplicationTimeoutAvailable) {
+ // Currently, YARN only support an overall LIFETIME application timeout
type
+ val timeoutType =
+ Utils.classForName[Enum[_]](APPLICATION_TIMEOUT_TYPE_CLASS, initialize
= false)
+ .getEnumConstants.head
+ val timeouts = Collections.singletonMap(timeoutType, timeout)
Review comment:
do we need any bounds checking on timeout? ie what if its 0 or -10?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]