HeartSaVioR commented on a change in pull request #28336:
URL: https://github.com/apache/spark/pull/28336#discussion_r420845743



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
##########
@@ -867,6 +868,20 @@ object ApplicationMaster extends Logging {
         val originalCreds = UserGroupInformation.getCurrentUser().getCredentials()
         SparkHadoopUtil.get.loginUserFromKeytab(principal, sparkConf.get(KEYTAB).orNull)
         val newUGI = UserGroupInformation.getCurrentUser()
+
+        if (master.isClusterMode) {
+          // Set the context class loader so that the token manager has access to jars
+          // distributed by the user.
+          Utils.withContextClassLoader(master.userClassLoader) {
+            // Re-obtain delegation tokens, as they might be outdated as of now. Add the fresh
+            // tokens on top of the original user's credentials (overwrite).
+            // This is only needed in cluster mode, because in client mode, AM will soon retrieve
+            // the latest tokens from the driver.
+            val credentialManager = new HadoopDelegationTokenManager(sparkConf, yarnConf, null)

Review comment:
       The `Why are the changes needed?` section of the PR description covers this, so please read through it. We intentionally obtain delegation tokens here even though the driver will obtain them again, because the driver (i.e. user code) may need delegation tokens (or a keytab login) before it initializes SparkContext.
   
   The sample application demonstrates the issue clearly: force-kill the AM after the initial delegation tokens have expired, and the new AM attempt fails because it starts with the expired tokens.
   
   
https://github.com/HeartSaVioR/spark-delegation-token-experiment/blob/master/src/main/scala/net/heartsavior/spark/example/LongRunningAppWithHDFSConfig.scala
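
   To make the failure mode concrete, here is a minimal, self-contained sketch of the scenario. It is only a toy model: `Token`, `obtainTokens`, `userCodeBeforeSparkContext` and `startAmAttempt` are hypothetical names invented for illustration and are not Spark or Hadoop APIs. It assumes a single "hdfs" token with a fixed lifetime and shows why a restarted AM attempt needs fresh tokens layered on top of the original credentials.

```scala
// Toy model only: none of these names exist in Spark or Hadoop.
final case class Token(service: String, expiresAtMs: Long) {
  def isValid(nowMs: Long): Boolean = nowMs < expiresAtMs
}

object AmRestartSketch {
  // Hypothetical stand-in for obtaining fresh tokens after a keytab login.
  def obtainTokens(nowMs: Long, lifetimeMs: Long): Map[String, Token] =
    Map("hdfs" -> Token("hdfs", nowMs + lifetimeMs))

  // Hypothetical stand-in for user code that touches HDFS before
  // SparkContext is initialized.
  def userCodeBeforeSparkContext(
      creds: Map[String, Token],
      nowMs: Long): Either[String, String] =
    creds.get("hdfs") match {
      case Some(t) if t.isValid(nowMs) => Right("ok")
      case Some(_) => Left("token expired")
      case None => Left("no token")
    }

  // A new AM attempt starts from the credentials captured at submission.
  // With reobtain = true, fresh tokens overwrite the original ones.
  def startAmAttempt(
      original: Map[String, Token],
      nowMs: Long,
      lifetimeMs: Long,
      reobtain: Boolean): Either[String, String] = {
    val creds =
      if (reobtain) original ++ obtainTokens(nowMs, lifetimeMs) else original
    userCodeBeforeSparkContext(creds, nowMs)
  }

  def main(args: Array[String]): Unit = {
    val lifetimeMs = 1000L
    val submitted = obtainTokens(0L, lifetimeMs) // tokens from submission time
    val restartAt = 5000L                        // AM killed after expiry

    // Without re-obtaining, user code sees the expired token.
    println(startAmAttempt(submitted, restartAt, lifetimeMs, reobtain = false))
    // With re-obtaining, the fresh token replaces the expired one.
    println(startAmAttempt(submitted, restartAt, lifetimeMs, reobtain = true))
  }
}
```

   In this model the first call yields `Left(token expired)` and the second yields `Right(ok)`, which mirrors the behavior seen with the sample application before and after the change.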




