HeartSaVioR opened a new pull request #28336: URL: https://github.com/apache/spark/pull/28336
### What changes were proposed in this pull request? This patch re-obtain tokens at the start of AM, if principal and keytab are available. It basically transfers the credentials from the original user, so this patch puts the new tokens into credentials from the original user via overwriting. To obtain tokens from providers in user application, this patch leverages the user classloader as context classloader while initializing token manager in the startup of AM. ### Why are the changes needed? Submitter will obtain delegation tokens for yarn-cluster mode, and add these credentials to the launch context. AM will be launched with these credentials, and AM and driver are able to leverage these tokens. In Yarn cluster mode, driver is launched in AM, which in turn initializes token manager (while initializing SparkContext) and obtain delegation tokens (+ schedule to renew) if both principal and keytab are available. That said, even we provide principal and keytab to run application with yarn-cluster mode, AM always starts with initial tokens from launch context until token manager runs and obtains delegation tokens. So there's a "gap", and if user codes (driver) access to external system with delegation tokens (e.g. HDFS) before initializing SparkContext, it cannot leverage the tokens token manager will obtain. It will make the application fail if AM is killed "after" the initial tokens are expired and relaunched. It's a regression: see below codes in branch-2.4: https://github.com/apache/spark/blob/branch-2.4/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala https://github.com/apache/spark/blob/branch-2.4/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/AMCredentialRenewer.scala In Spark 2.4.x, AM runs AMCredentialRenewer at initialization, and AMCredentialRenew obtains tokens and merge with credentials being provided with launch context of AM. So it guarantees new tokens in driver run. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Manually tested with specifically crafted application (simple reproducer) - https://github.com/HeartSaVioR/spark-delegation-token-experiment/blob/master/src/main/scala/net/heartsavior/spark/example/LongRunningAppWithHDFSConfig.scala Before this patch, new AM attempt failed when I killed AM after the expiration of tokens. After this patch the new AM attempt runs fine. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
