GitHub user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9168#discussion_r42779515
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
    @@ -130,6 +130,21 @@ class SparkHadoopUtil extends Logging {
         UserGroupInformation.loginUserFromKeytab(principalName, keytabFilename)
       }
     
    +  def addCredentialsToCurrentUser(credentials: Credentials, freshHadoopConf: Configuration): Unit ={
    +    UserGroupInformation.getCurrentUser.addCredentials(credentials)
    +
    +    // HACK:
    +    // In HA mode, the function FileSystem.addDelegationTokens only returns a token for HA
    +    // NameNode. HDFS Client will generate private tokens for each NameNode according to the
    +    // token for HA NameNode and uses these private tokens to communicate with each NameNode.
    +    // If spark only update token for HA NameNode, HDFS Client will use the old private tokens,
    +    // which will cause token expired Error.
    +    // So:
    +    // We create a new HDFS Client, so that the new HDFS Client will generate and update the
    +    // private tokens for each NameNode.
    +    FileSystem.get(freshHadoopConf).close()
    --- End diff --
    
    That's not going to create a new client. Instead, the call goes to
    `get(getDefaultUri(conf), conf)`, which caches the filesystem under a key of
    (fsURI + currentUser) and returns the cached instance on later calls. That
    is why Hadoop 2 added a way to explicitly get a unique one.
    
    To get a guaranteed unique instance from this method, the code needs to
    disable caching in that configuration. That is done by setting a boolean
    property whose name is built from the FS scheme:
    
    ```
    String disableCacheName = String.format("fs.%s.impl.disable.cache", scheme);
    conf.setBoolean(disableCacheName, true);
    ```
    
    A bit convoluted, but without it we'll get back the same filesystem
    instance, and the process stays stuck with it.
    
    Note that the code to create a unique FS, say some method
    `createUniqueFSInstance(freshHadoopConf: Configuration): FileSystem`, could
    be factored out as its own method, with a test to verify that repeated calls
    return distinct objects. That would validate the requirements and catch any
    future regressions in the Hadoop codebase.
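
    A minimal sketch of what that factored-out method and its check might look
    like. The enclosing object `UniqueFs` is hypothetical, and copying the
    configuration before changing it is an assumption made here so the caller's
    `freshHadoopConf` is left untouched; the check assumes the default
    filesystem of an otherwise empty configuration (`file:///`).

    ```scala
    import java.net.URI

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.FileSystem

    // Hypothetical holder object; not part of Spark or Hadoop.
    object UniqueFs {

      // Returns a FileSystem that bypasses the FileSystem cache.
      def createUniqueFSInstance(freshHadoopConf: Configuration): FileSystem = {
        // Assumption: work on a copy so the caller's configuration is not mutated.
        val conf = new Configuration(freshHadoopConf)
        val uri: URI = FileSystem.getDefaultUri(conf)
        // Build the per-scheme property that disables caching, e.g.
        // fs.hdfs.impl.disable.cache for an hdfs:// default filesystem.
        conf.setBoolean(s"fs.${uri.getScheme}.impl.disable.cache", true)
        FileSystem.get(uri, conf)
      }

      // Simple check that repeated calls return distinct objects.
      def main(args: Array[String]): Unit = {
        val conf = new Configuration()
        val first = createUniqueFSInstance(conf)
        val second = createUniqueFSInstance(conf)
        try {
          assert(first ne second, "expected two distinct FileSystem instances")
        } finally {
          first.close()
          second.close()
        }
      }
    }
    ```

    Because the instances are not cached, the caller owns each one and is
    responsible for closing it.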

