Github user dougb commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5031#discussion_r26797658
  
    --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -903,6 +908,30 @@ object Client extends Logging {
       }
     
       /**
    +   * Obtains token for the Hive metastore and adds them to the credentials.
    +   */
    +  private def obtainTokenForHiveMetastore(conf: Configuration, credentials: Credentials) {
    +    if (UserGroupInformation.isSecurityEnabled /* And Hive is enabled */) {
    +      val hc = org.apache.hadoop.hive.ql.metadata.Hive.get
    +      val principal = hc.getConf().get(HiveConf.ConfVars.METASTORE_KERBEROS_PRINCIPAL.varname)
    +      val username = UserGroupInformation.getCurrentUser().getUserName
    +
    +      if (principal == null) {
    +        val errorMessage = "Required hive metastore principal is not configured!"
    +        logError(errorMessage)
    +        throw new IllegalArgumentException(errorMessage)
    +      }
    +
    +      val tokenStr = hc.getDelegationToken(username,principal)
    +      val hive2Token = new Token[DelegationTokenIdentifier]()
    +      hive2Token.decodeFromUrlString(tokenStr)
    +      credentials.addToken(new Text("hive.server2.delegation.token"), hive2Token)
    +      logDebug("Added the Hive Server 2 token to conf.")
    +      org.apache.hadoop.hive.ql.metadata.Hive.closeCurrent
    --- End diff --
    
    I talked to @marmbrus and @pwendell at the Spark Summit yesterday.
    I think I'm going to take a shot at the reflection approach first. 
    
    I would like to see the various delegation tokens collected automatically.
    I think it would be a burden for users to remember to do this in every job.
     
    I also don't see anything that renews the NameNode tokens. I'm still looking
    around to see how it's handled in other projects.
    
    I think an expert user could add the delegation tokens to the conf by hand,
    if they knew what the config option was and how to get and encode the token.
    
    I've only just started digging into Spark, but it looks like delegation token
    management could be better. I need to look at how MapReduce jobs handle this.
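
A minimal sketch of what the reflection approach mentioned above might look like, so that the YARN client would not need Hive on its compile-time classpath. The Hive method names (`get`, `getConf`, `getDelegationToken`, `closeCurrent`) are the ones used in the diff. The object name, the method name `fetchTokenString`, and the `hiveClassName` parameter are hypothetical, and the literal config key is assumed to be the value of `HiveConf.ConfVars.METASTORE_KERBEROS_PRINCIPAL.varname`. It only returns the encoded token string; decoding it into a `Token` and adding it to the `Credentials` would stay as in the diff.

```scala
object HiveTokenReflectionSketch {
  // Class name taken from the diff. It is a parameter below (hypothetical) so
  // the lookup can be exercised without Hive on the classpath.
  val DefaultHiveClass = "org.apache.hadoop.hive.ql.metadata.Hive"

  // Assumed value of HiveConf.ConfVars.METASTORE_KERBEROS_PRINCIPAL.varname.
  val PrincipalKey = "hive.metastore.kerberos.principal"

  /** Returns the encoded delegation token string, or None when the Hive class cannot be loaded. */
  def fetchTokenString(username: String,
                       hiveClassName: String = DefaultHiveClass): Option[String] = {
    // A missing class means Hive is not available, so there is nothing to fetch.
    val hiveClass: Option[Class[_]] =
      try Some(Class.forName(hiveClassName))
      catch { case _: ClassNotFoundException => None }

    hiveClass.flatMap { cls =>
      // Hive.get is static, so the receiver passed to invoke is null.
      val hive = cls.getMethod("get").invoke(null)
      try {
        val hiveConf = cls.getMethod("getConf").invoke(hive)
        val principal = hiveConf.getClass
          .getMethod("get", classOf[String])
          .invoke(hiveConf, PrincipalKey)
          .asInstanceOf[String]
        // Same failure mode as the diff: an IllegalArgumentException.
        require(principal != null, "Required hive metastore principal is not configured!")

        val token = cls
          .getMethod("getDelegationToken", classOf[String], classOf[String])
          .invoke(hive, username, principal)
          .asInstanceOf[String]
        Option(token)
      } finally {
        // Mirror the diff: release the thread-local Hive client.
        cls.getMethod("closeCurrent").invoke(null)
      }
    }
  }
}
```

With this shape, a build without Hive simply gets `None` back and skips the token, while errors raised by Hive itself still propagate (wrapped in `InvocationTargetException`, since the calls go through reflection).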


