[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205518#comment-13205518
 ] 

Daryn Sharp commented on MAPREDUCE-3825:
----------------------------------------

I'm open to alternatives, but eliminating the dups is actually pretty 
simple:
{code}
  static void obtainTokensForNamenodesInternal(Credentials credentials,
       Path[] ps, Configuration conf) throws IOException {
    // --- start new code ---
    // use 2 passes to avoid redundant calls to the same filesystems
    // start by getting unique set of filesystems for all paths
    Set<FileSystem> pathFsSet = new HashSet<FileSystem>();
    for (Path p : ps) {
      pathFsSet.add(p.getFileSystem(conf));
    }
    // get the unique set of leaf filesystems
    Set<FileSystem> tokenFsSet = new HashSet<FileSystem>();
    for (FileSystem fs : pathFsSet) {
      tokenFsSet.addAll(fs.getFileSystems());
    }
    // --- end new code ---
    // get all the tokens from the now flattened list of leaf filesystems
    for (FileSystem fs : tokenFsSet) {
      obtainTokensForNamenodesPrivate(fs, credentials, conf);
    }
  }
{code}

If many files are in the same filesystem, then a lot of unnecessary processing 
occurs without this, especially in the case of viewfs.
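
To make the effect of the two passes concrete, here is a minimal, standalone 
sketch that runs without Hadoop on the classpath. The {{Fs}} class is a 
hypothetical stand-in for {{FileSystem}}, not the real API: it only models 
{{getFileSystems()}} as used in the snippet above, and it assumes that paths on 
the same filesystem resolve to the same instance so the sets can dedup them. 
The names {{TokenDedupSketch}}, {{obtainTokens}} and {{demo}} are made up for 
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TokenDedupSketch {

    /** Hypothetical stand-in for a Hadoop FileSystem; not the real class. */
    static class Fs {
        final String name;
        final List<Fs> mounts; // empty for a leaf filesystem

        Fs(String name, Fs... mounts) {
            this.name = name;
            this.mounts = Arrays.asList(mounts);
        }

        /** Assumed behavior: a leaf returns itself, a viewfs-like fs returns its mounts. */
        List<Fs> getFileSystems() {
            return mounts.isEmpty() ? Collections.singletonList(this) : mounts;
        }
    }

    /** Returns the names of the filesystems a token would be requested from. */
    public static List<String> obtainTokens(List<Fs> fsPerPath) {
        // pass 1: unique set of filesystems for all paths
        Set<Fs> pathFsSet = new LinkedHashSet<>(fsPerPath);
        // pass 2: unique set of leaf filesystems
        Set<Fs> tokenFsSet = new LinkedHashSet<>();
        for (Fs fs : pathFsSet) {
            tokenFsSet.addAll(fs.getFileSystems());
        }
        // one token request per leaf filesystem
        List<String> requested = new ArrayList<>();
        for (Fs fs : tokenFsSet) {
            requested.add(fs.name);
        }
        return requested;
    }

    /** Four paths: three on a viewfs with two mounts, one directly on a mount. */
    public static List<String> demo() {
        Fs hdfsA = new Fs("hdfsA");
        Fs hdfsB = new Fs("hdfsB");
        Fs viewfs = new Fs("viewfs", hdfsA, hdfsB);
        return obtainTokens(Arrays.asList(viewfs, viewfs, hdfsA, viewfs));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [hdfsA, hdfsB]
    }
}
```

A per-path loop over the same input would visit the viewfs mounts three times 
and {{hdfsA}} a fourth time; here each leaf filesystem is asked once.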

I may be misunderstanding this variation, but acquiring tokens via recursive 
calls will require more changes, and those changes may break non-Hadoop 
distributed filesystems.  I think it will require either duplicating the code 
of the default {{getDelegationTokens(renewer, creds)}}, or adding a new API 
that overrides of this method can call to avoid getting dups.  The proposed 
default implementation of {{FileSystem#getDelegations(renewer, creds)}} simply 
iterates {{this.getFileSystems()}} too.  I'll write something up and then we 
can discuss a little more.
                
> Need generalized multi-token filesystem support
> -----------------------------------------------
>
>                 Key: MAPREDUCE-3825
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.23.1, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: MAPREDUCE-3825.patch
>
>
> This is the counterpart to HADOOP-7967.  The token cache currently tries to 
> assume a filesystem's token service key.  The assumption generally worked 
> while there was a one-to-one mapping of filesystem to token.  With the advent 
> of multi-token filesystems like viewfs, the token cache will try to use a 
> service key (i.e., for viewfs) that will never exist (because it really gets 
> the mounted fs tokens).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
