[
https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205518#comment-13205518
]
Daryn Sharp commented on MAPREDUCE-3825:
----------------------------------------
I'm open to alternatives, but eliminating the dups is actually pretty
simple:
{code}
static void obtainTokensForNamenodesInternal(Credentials credentials,
    Path[] ps, Configuration conf) throws IOException {
  // --- start new code ---
  // use 2 passes to avoid redundant calls to the same filesystems
  // start by getting unique set of filesystems for all paths
  Set<FileSystem> pathFsSet = new HashSet<FileSystem>();
  for (Path p : ps) {
    pathFsSet.add(p.getFileSystem(conf));
  }
  // get the unique set of leaf filesystems
  Set<FileSystem> tokenFsSet = new HashSet<FileSystem>();
  for (FileSystem fs : pathFsSet) {
    tokenFsSet.addAll(fs.getFileSystems());
  }
  // --- end new code ---
  // get all the tokens from the now flattened set of leaf filesystems
  for (FileSystem fs : tokenFsSet) {
    obtainTokensForNamenodesPrivate(fs, credentials, conf);
  }
}
{code}
If many files are in the same filesystem, then a lot of unnecessary processing
occurs without the dedup, esp. in the case of viewfs.
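To make the saving concrete, here is a minimal self-contained sketch of the same two-pass idea. It does not use the Hadoop classes: {{ToyFs}}, {{leavesNeedingTokens}} and the example URIs are hypothetical stand-ins, and the toy {{getFileSystems()}} only mimics the role of the proposed method. It also relies on identity equality, i.e. it assumes paths on the same filesystem resolve to the same instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; these are NOT the Hadoop classes.
class TokenDedupSketch {

  /** Toy filesystem: a leaf returns itself, a mount table returns its targets. */
  static class ToyFs {
    final String name;
    final List<ToyFs> mounts; // empty for a leaf filesystem

    ToyFs(String name, ToyFs... mounts) {
      this.name = name;
      this.mounts = Arrays.asList(mounts);
    }

    /** Plays the role of the proposed getFileSystems(): flatten to leaves. */
    List<ToyFs> getFileSystems() {
      if (mounts.isEmpty()) {
        return Collections.singletonList(this);
      }
      List<ToyFs> leaves = new ArrayList<>();
      for (ToyFs m : mounts) {
        leaves.addAll(m.getFileSystems());
      }
      return leaves;
    }
  }

  /** Returns the leaf filesystems that need a token, each exactly once. */
  static Set<ToyFs> leavesNeedingTokens(List<ToyFs> fsPerPath) {
    // Pass 1: collapse the per-path filesystems to a unique set.
    Set<ToyFs> pathFsSet = new LinkedHashSet<>(fsPerPath);
    // Pass 2: flatten each unique filesystem to its leaves, de-duplicated again.
    Set<ToyFs> tokenFsSet = new LinkedHashSet<>();
    for (ToyFs fs : pathFsSet) {
      tokenFsSet.addAll(fs.getFileSystems());
    }
    return tokenFsSet;
  }

  public static void main(String[] args) {
    ToyFs nn1 = new ToyFs("hdfs://nn1");
    ToyFs nn2 = new ToyFs("hdfs://nn2");
    ToyFs view = new ToyFs("viewfs://cluster", nn1, nn2);
    // Three paths on the view plus one path directly on nn1.
    List<ToyFs> fsPerPath = Arrays.asList(view, view, view, nn1);
    for (ToyFs fs : leavesNeedingTokens(fsPerPath)) {
      System.out.println(fs.name);
    }
  }
}
```

With four input paths, the sketch ends up with only two leaf filesystems to fetch tokens from, where a naive per-path loop would flatten the view three times and visit nn1 four times.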
I may be misunderstanding this variation, but acquiring tokens via
recursive calls will require more changes that may break non-Hadoop distributed
filesystems. I think it will require either duplicating the code of the default
{{getDelegationTokens(renewer, creds)}}, or a new api that overrides of this
method can call to avoid getting dups. The proposed default implementation of
{{FileSystem#getDelegationTokens(renewer, creds)}} simply iterates over
{{this.getFileSystems()}} too. I'll write something up and then we can discuss
a little more.
> Need generalized multi-token filesystem support
> -----------------------------------------------
>
> Key: MAPREDUCE-3825
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: security
> Affects Versions: 0.23.1, 0.24.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Attachments: MAPREDUCE-3825.patch
>
>
> This is the counterpart to HADOOP-7967. The token cache currently tries to
> assume a filesystem's token service key. The assumption generally worked
> while there was a one-to-one mapping of filesystem to token. With the advent
> of multi-token filesystems like viewfs, the token cache will try to use a
> service key (i.e. for viewfs) that will never exist (because it really gets
> the mounted fs tokens).