[
https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202824#comment-13202824
]
Daryn Sharp commented on MAPREDUCE-3825:
----------------------------------------
This jira was filed after discussions with Sid on other token/viewfs jiras.
There are multiple problems that this jira and the linked jira are jointly
trying to address:
# {{FileSystem#getDelegationTokens(String, Credentials)}} will fetch duplicate
tokens.
# {{ViewFileSystem#getDelegationTokens(String)}} will fetch duplicate tokens,
ie. a token for every mount point even if the filesystems are identical.
# {{ViewFileSystem#getDelegationTokens(String, Credentials)}} will skip
filesystems w/o a serviceName. A null serviceName means the filesystem itself
doesn't have a token, but it may be filtering a filesystem that does have
tokens.
# {{ViewFileSystem#getDelegationTokens(String, Credentials)}} calls
{{targetFileSystem.getDelegationTokens(String)}} which may acquire duplicate
tokens, even for the services that viewfs thinks it's already seen.
# {{ViewFileSystem#getDelegationTokens(String, Credentials)}} will acquire
multiple duplicate tokens for a filtered filesystem & its contained filesystem
because it checks the service of the filtered fs, not the contained fs. A
duplicate token will be acquired for every path. Although not implemented,
unionfs will exacerbate the multiple duplicate tokens.
# {{TokenCache}} thinks the viewfs authority is a service name so it tries to
resolve it as a hostname:port tuple and fails.
# {{TokenCache}} assumes a 1 to 1 mapping between a filesystem's service and
its token which is broken for a 1 to many token filesystem. This causes
{{TokenCache}} to repeatedly fetch tokens from a multi-token filesystem because
it never gets a token with the expected service.
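As a rough illustration of the last item, here is a minimal sketch of the one to one assumption breaking down. It is a toy model, not Hadoop code: {{OneToManySketch}}, {{MultiTokenFs}}, {{obtainTokens}}, and the service strings are all hypothetical names, and the credentials are just a map keyed by service.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class OneToManySketch {
    /** Toy stand-in for a multi-token filesystem such as viewfs; NOT a Hadoop class. */
    static final class MultiTokenFs {
        final String authority;          // what the cache wrongly treats as the service key
        final String[] mountedServices;  // services of the mounted filesystems
        int fetches = 0;                 // how many times tokens were requested

        MultiTokenFs(String authority, String... mountedServices) {
            this.authority = authority;
            this.mountedServices = mountedServices;
        }

        /** Returns one token per mounted service; none is keyed by the authority. */
        Map<String, String> fetchTokens() {
            fetches++;
            Map<String, String> tokens = new LinkedHashMap<>();
            for (String service : mountedServices) {
                tokens.put(service, "token@" + service);
            }
            return tokens;
        }
    }

    /** Models the one to one assumption: fetch unless a token for the expected key exists. */
    static void obtainTokens(Map<String, String> credentials, MultiTokenFs fs) {
        if (!credentials.containsKey(fs.authority)) {
            credentials.putAll(fs.fetchTokens());
        }
    }

    public static void main(String[] args) {
        MultiTokenFs view = new MultiTokenFs("viewfs-table", "nn1:8020", "nn2:8020");
        Map<String, String> credentials = new HashMap<>();
        for (int i = 0; i < 3; i++) {
            obtainTokens(credentials, view);
        }
        System.out.println("fetches = " + view.fetches);               // fetches = 3
        System.out.println(credentials.containsKey("viewfs-table"));   // false
    }
}
```

Every call refetches because the expected key never shows up in the credentials, even though tokens for both mounted services are already present.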
Those are the issues that I can recall off the top of my head. The approach
I've taken is:
* Allow the retrieval of a unique set of filesystems used
* Query each filesystem only once
* Never retrieve duplicate tokens because the list is unique
* Solve the 1 to many problem in {{TokenCache}}
* Fix the filtered filesystem issue by querying its underlying filesystem
* Fix the viewfs mount table problem by only requesting tokens from the mounted
filesystems
* Complete backwards compatibility with the existing api
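The flattening idea in the list above can be sketched roughly as follows. This is a toy model under stated assumptions, not the patch itself and not the Hadoop API: {{FlattenSketch}}, {{Fs}}, {{collect}}, {{fetchTokens}}, and the service strings are hypothetical, and a null service stands for a layer (filter or mount table) that issues no token of its own.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlattenSketch {
    /** Toy filesystem node; NOT the Hadoop FileSystem class. */
    static final class Fs {
        final String service;     // null: this layer issues no token itself
        final List<Fs> children;  // mounted or wrapped filesystems
        int tokenRequests = 0;    // how many times this node was asked for a token

        Fs(String service, Fs... children) {
            this.service = service;
            this.children = Arrays.asList(children);
        }

        String issueToken() {
            tokenRequests++;
            return "token@" + service;
        }
    }

    /** Walks the layering and keeps the first filesystem seen for each service. */
    static void collect(Fs fs, Map<String, Fs> unique) {
        if (fs.service != null) {
            unique.putIfAbsent(fs.service, fs);
        }
        for (Fs child : fs.children) {
            collect(child, unique);
        }
    }

    /** Queries each unique token-issuing filesystem exactly once. */
    static Map<String, String> fetchTokens(Fs root) {
        Map<String, Fs> unique = new LinkedHashMap<>();
        collect(root, unique);
        Map<String, String> tokens = new LinkedHashMap<>();
        for (Fs fs : unique.values()) {
            tokens.put(fs.service, fs.issueToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        Fs nn1 = new Fs("nn1:8020");
        Fs nn2 = new Fs("nn2:8020");
        Fs filtered = new Fs(null, nn1);             // filter over nn1, no service of its own
        Fs view = new Fs(null, nn1, filtered, nn2);  // mount table; nn1 is reachable twice
        System.out.println(fetchTokens(view).keySet());  // [nn1:8020, nn2:8020]
        System.out.println(nn1.tokenRequests);           // 1
    }
}
```

Because the set is deduplicated before any token is requested, a filesystem reachable through several mounts or filters is still only asked once.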
The current model is too complex and won't scale. It arguably "works" in
simple cases, but it acquires multiple tokens, errors out if the authority
isn't a service, violates the contract that a null service means no token, and
won't work with more complex layering of filesystems. By flattening out the list of
filesystems, the {{ViewFileSystem}} implementation is dramatically simpler, and
it will handle all types of filesystem layering w/o acquiring multiple tokens.
bq. When we changed from getDelegationToken() to getDelegationTokens() we had
dismissed the alternate you are proposing since we needed a method to get
delegation token from a file system anyway.
I'm not sure what this means. If you are referring to token renewal, this jira
is completely orthogonal and is not trying to implement any of those proposals.
(Although ironically, at that time I wanted to implement
{{getDelegationTokens}} but was told there was no use case...) Again though, I
believe I've maintained complete backwards compatibility.
> Need generalized multi-token filesystem support
> -----------------------------------------------
>
> Key: MAPREDUCE-3825
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: security
> Affects Versions: 0.23.1, 0.24.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
>
> This is the counterpart to HADOOP-7967. The token cache currently tries to
> assume a filesystem's token service key. The assumption generally worked
> while there was a one to one mapping of filesystem to token. With the advent
> of multi-token filesystems like viewfs, the token cache will try to use a
> service key (ie. for viewfs) that will never exist (because it really gets
> the mounted fs tokens).