Github user marsishandsome commented on the pull request:
https://github.com/apache/spark/pull/9168#issuecomment-149398132
In my opinion, the reason is the following:
1. The Spark AM obtains an HDFS delegation token and adds it to the current user's credentials. This token looks like:
token1: "ha-hdfs:hadoop-namenode" -> "Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hadoop-namenode, Ident: (HDFS_DELEGATION_TOKEN token 328709 for test)"
2. The DFSClient then generates another two tokens, one for each NameNode:
token2: "ha-hdfs://xxx.xxx.xxx.xxx:8020" -> "Kind: HDFS_DELEGATION_TOKEN, Service: xxx.xxx.xxx.xxx:8020, Ident: (HDFS_DELEGATION_TOKEN token 328708 for test)"
token3: "ha-hdfs://yyy:yyy:yyy:yyy:8020" -> "Kind: HDFS_DELEGATION_TOKEN, Service: yyy:yyy:yyy:yyy:8020, Ident: (HDFS_DELEGATION_TOKEN token 328708 for test)"
3. When Spark updates token1, the DFSClient does not regenerate token2 and token3 automatically, yet the DFSClient only uses token2 and token3 to communicate with the two NameNodes.
4. FileSystem has a cache, so calling FileSystem.get returns a cached DFSClient, which still holds the old tokens. Spark only updates token1, but the DFSClient keeps using token2 and token3.
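The staleness in step 4 can be sketched generically: a cache keyed by URI hands back the same client instance, and that client snapshots its credentials at construction time, so later credential updates never reach it. This is a minimal illustration with hypothetical Token/Client/get names, not the actual Hadoop FileSystem or DFSClient code:

```java
import java.util.HashMap;
import java.util.Map;

public class CacheSketch {
    // Hypothetical stand-in for a delegation token.
    static class Token {
        final String service;
        final int ident;
        Token(String service, int ident) { this.service = service; this.ident = ident; }
    }

    // Stand-in for DFSClient: it copies the tokens it was given at
    // construction time and keeps using that snapshot afterwards.
    static class Client {
        final Map<String, Token> tokens;
        Client(Map<String, Token> creds) { this.tokens = new HashMap<>(creds); }
    }

    // Stand-in for the FileSystem cache: the first get() for a URI builds a
    // client; every later call returns the same (possibly stale) instance.
    static final Map<String, Client> CACHE = new HashMap<>();

    static Client get(String uri, Map<String, Token> creds) {
        return CACHE.computeIfAbsent(uri, u -> new Client(creds));
    }

    public static void main(String[] args) {
        Map<String, Token> creds = new HashMap<>();
        creds.put("ha-hdfs:hadoop-namenode",
                  new Token("ha-hdfs:hadoop-namenode", 328709));
        Client first = get("hdfs://hadoop-namenode", creds);

        // Spark later refreshes the token in the current user's credentials...
        creds.put("ha-hdfs:hadoop-namenode",
                  new Token("ha-hdfs:hadoop-namenode", 999999));

        // ...but the cached client still holds the old snapshot.
        Client second = get("hdfs://hadoop-namenode", creds);
        System.out.println(second == first);                                    // true
        System.out.println(second.tokens.get("ha-hdfs:hadoop-namenode").ident); // 328709
    }
}
```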
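For reference (standard Hadoop behavior, independent of whatever this PR changes): the per-scheme cache in step 4 can be bypassed either by calling FileSystem.newInstance(...), which always constructs a fresh client, or by disabling the cache in the configuration:

```xml
<!-- Disable the FileSystem cache for the hdfs:// scheme, so each
     FileSystem.get(...) builds a new client with current credentials. -->
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
```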