[
https://issues.apache.org/jira/browse/SPARK-31812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mohammad Islam updated SPARK-31812:
-----------------------------------
Fix Version/s: (was: 2.4.7)
> Spark to support the auto cancelation of delegation token when an Application
> completes
> ---------------------------------------------------------------------------------------
>
> Key: SPARK-31812
> URL: https://issues.apache.org/jira/browse/SPARK-31812
> Project: Spark
> Issue Type: Improvement
> Components: Spark Submit
> Affects Versions: 2.4.5
> Reporter: Mohammad Islam
> Priority: Major
>
> *Context* :
> YARN provides a client API
> [setCancelTokensWhenComplete|http://hadoop.apache.org/docs/r2.8.0/hadoop-yarn/hadoop-yarn-api/apidocs/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.html#setCancelTokensWhenComplete(boolean)]
> to manage the delegation token (DT) lifecycle. By default, YARN [cancels the
> DT|https://github.com/apache/hadoop/blob/8f78aeb2500011e568929b585ed5b0987355f88d/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto#L513]
> when the application finishes. However, the user can override this so that
> the DT is not canceled after the application completes. In some instances,
> cancelation is required to lessen the HDFS/KMS memory footprint by reducing
> the number of outstanding DTs.
> MR and Tez already allow this through client configs:
> _mapreduce.job.complete.cancel.delegation.tokens_ and
> _tez.cancel.delegation.tokens.on.completion_, respectively.
> *Proposal* :
> Currently, Spark doesn't support this. However, we may need to manage the
> DT lifecycle outside the YARN/Spark framework.
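
A minimal sketch of how such a client setting could be mapped onto the YARN flag, in the spirit of the MR and Tez options mentioned above. This is an illustration under stated assumptions, not Spark's implementation: the key `spark.yarn.cancelDelegationTokensOnCompletion` is hypothetical, and `SubmissionContext` is a stand-in for the single `setCancelTokensWhenComplete(boolean)` method of YARN's `ApplicationSubmissionContext`, so that the example runs without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: maps a client-side setting onto YARN's cancel-on-completion flag. */
class TokenCancelSketch {
    // Hypothetical key for illustration; not an existing Spark setting.
    static final String CANCEL_KEY = "spark.yarn.cancelDelegationTokensOnCompletion";

    /** Stand-in for the one method used from YARN's ApplicationSubmissionContext. */
    interface SubmissionContext {
        void setCancelTokensWhenComplete(boolean cancel);
    }

    /** Resolves the flag; an absent key keeps YARN's default of canceling. */
    static boolean shouldCancel(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(CANCEL_KEY, "true"));
    }

    /** Copies the resolved flag onto the submission context before submit. */
    static void apply(Map<String, String> conf, SubmissionContext ctx) {
        ctx.setCancelTokensWhenComplete(shouldCancel(conf));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(CANCEL_KEY, "false"); // user opts out of automatic cancelation

        boolean[] recorded = {true};
        apply(conf, cancel -> recorded[0] = cancel);
        System.out.println("cancelTokensWhenComplete=" + recorded[0]);
    }
}
```

In a real client, the lambda would be replaced by the actual `ApplicationSubmissionContext` built during submission. When the user opts out, something outside YARN has to cancel or expire the token, which is the lifecycle concern raised in the proposal.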