[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202824#comment-13202824
 ] 

Daryn Sharp commented on MAPREDUCE-3825:
----------------------------------------

This jira was filed after discussions with Sid on other token/viewfs jiras.  
This jira and the linked jira are both trying to address several related 
problems:
# {{FileSystem#getDelegationTokens(String, Credentials)}} will fetch duplicate 
tokens.
# {{ViewFileSystem#getDelegationTokens(String)}} will fetch duplicate tokens, 
i.e. a token for every mount point even if the filesystems are identical.
# {{ViewFileSystem#getDelegationTokens(String, Credentials)}} will skip 
filesystems w/o a serviceName.  A null serviceName means that the filesystem 
itself doesn't have a token, but it may be filtering a filesystem that does 
have tokens.
# {{ViewFileSystem#getDelegationTokens(String, Credentials)}} calls 
{{targetFileSystem.getDelegationTokens(String)}} which may acquire duplicate 
tokens, even for the services that viewfs thinks it's already seen.
# {{ViewFileSystem#getDelegationTokens(String, Credentials)}} will acquire 
multiple duplicate tokens for a filtered filesystem & its contained filesystem 
because it checks the service of the filtered fs, not the contained fs.  A 
duplicate token will be acquired for every path.  Although not implemented, 
unionfs will exacerbate the multiple duplicate tokens.
# {{TokenCache}} thinks the viewfs authority is a service name so it tries to 
resolve it as a hostname:port tuple and fails.
# {{TokenCache}} assumes a 1 to 1 mapping between a filesystem's service and 
its token which is broken for a 1 to many token filesystem.  This causes 
{{TokenCache}} to repeatedly fetch tokens from a multi-token filesystem because 
it never gets a token with the expected service.
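
To make the last item concrete, here is a minimal sketch of the 1 to many lookup miss.  It is not the real {{TokenCache}} code: the {{Cache}} class, the {{obtainFor}} method, and the service strings are all hypothetical stand-ins chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class OneToManySketch {
    /** Hypothetical cache mapping a token's service name to the token. */
    public static final class Cache {
        public final Map<String, String> tokens = new HashMap<>();
        public int fetches = 0;

        /**
         * Models the flawed 1 to 1 assumption: the cache is probed with the
         * filesystem's own service key, while the stored tokens are keyed by
         * the services of the tokens that the filesystem actually returned.
         */
        public void obtainFor(String fsServiceKey, String... returnedServices) {
            if (tokens.containsKey(fsServiceKey)) {
                return; // cache hit, nothing to fetch
            }
            fetches++;
            for (String service : returnedServices) {
                tokens.put(service, "token-for-" + service);
            }
        }
    }

    public static void main(String[] args) {
        Cache cache = new Cache();
        // A multi-token filesystem: its own key never matches a stored token.
        for (int i = 0; i < 3; i++) {
            cache.obtainFor("viewfs-cluster", "nn1:8020", "nn2:8020");
        }
        System.out.println(cache.fetches); // prints 3
    }
}
```

A single-token filesystem whose key equals its token's service is fetched only once under the same logic, which is why the assumption generally held before viewfs.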

Those are the issues that I can recall off the top of my head.  The approach 
I've taken is:
* Allow the retrieval of the unique set of filesystems used
* Query each filesystem only once
* Never retrieve duplicate tokens because the list is unique
* Solve the 1 to many problem in {{TokenCache}}
* Fix the filtered filesystem issue by querying its underlying filesystem
* Fix the viewfs mount table problem by only requesting tokens from the mounted 
filesystems
* Maintain complete backwards compatibility with the existing api
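
The flattening idea in the list above can be sketched roughly as follows.  This is only an illustration under assumptions, not the patch itself: {{Fs}}, {{Leaf}}, {{Composite}}, {{uniqueTokenSources}} and {{fetchTokens}} are hypothetical names, a plain map stands in for {{Credentials}}, and none of this is the Hadoop {{FileSystem}} api.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlattenSketch {
    /** Hypothetical stand-in for a filesystem. */
    public interface Fs {
        String serviceName(); // null means this filesystem issues no token itself
        List<Fs> children();  // mounted or wrapped filesystems; empty for a leaf
    }

    /** A filesystem that issues exactly one token under its service name. */
    public static final class Leaf implements Fs {
        private final String service;
        public Leaf(String service) { this.service = service; }
        public String serviceName() { return service; }
        public List<Fs> children() { return new ArrayList<>(); }
    }

    /** Models a mount table or a filter: no token of its own, only children. */
    public static final class Composite implements Fs {
        private final List<Fs> children;
        public Composite(Fs... children) { this.children = Arrays.asList(children); }
        public String serviceName() { return null; }
        public List<Fs> children() { return children; }
    }

    /** Flatten any layering into the unique token-issuing filesystems. */
    public static Map<String, Fs> uniqueTokenSources(Fs root) {
        Map<String, Fs> byService = new LinkedHashMap<>();
        collect(root, byService);
        return byService;
    }

    private static void collect(Fs fs, Map<String, Fs> byService) {
        if (fs.serviceName() != null) {
            byService.putIfAbsent(fs.serviceName(), fs);
        }
        for (Fs child : fs.children()) {
            collect(child, byService); // look through filters and mounts
        }
    }

    /** Query each unique filesystem once; skip services already held. */
    public static List<String> fetchTokens(Fs root, Map<String, String> credentials) {
        List<String> fetched = new ArrayList<>();
        for (String service : uniqueTokenSources(root).keySet()) {
            if (!credentials.containsKey(service)) {
                credentials.put(service, "token-for-" + service);
                fetched.add(service);
            }
        }
        return fetched;
    }

    public static void main(String[] args) {
        // Two mounts share nn1 (one behind a filter); a third points at nn2.
        Fs view = new Composite(new Leaf("nn1:8020"),
                                new Composite(new Leaf("nn1:8020")),
                                new Leaf("nn2:8020"));
        Map<String, String> creds = new LinkedHashMap<>();
        System.out.println(fetchTokens(view, creds)); // prints [nn1:8020, nn2:8020]
        System.out.println(fetchTokens(view, creds)); // prints []
    }
}
```

Because the set is keyed by the token's own service, a second call against the same credentials fetches nothing, however the filesystems are layered.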

The current model is too complex and won't scale.  It arguably "works" in 
simple cases, but it acquires multiple tokens, errors out if the authority 
isn't a service, violates the contract that a null service means no token, and 
won't work with more complex layering of filesystems.  By flattening out the 
list of filesystems, the {{ViewFileSystem}} implementation is dramatically 
simpler, and it will handle all types of filesystem layering w/o acquiring 
multiple tokens.

bq. When we changed from getDelegationToken() to getDelegationTokens() we had 
dismissed the alternate you are proposing since we needed a method to get 
delegation token from a file system anyway.

I'm not sure what this means.  If you are referring to token renewal, this jira 
is completely orthogonal and is not trying to implement any of those proposals. 
 (Although ironically, at that time I wanted to implement 
{{getDelegationTokens}} but was told there was no use case...)  Again though, I 
believe I've maintained complete backwards compatibility.

                
> Need generalized multi-token filesystem support
> -----------------------------------------------
>
>                 Key: MAPREDUCE-3825
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.23.1, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>
> This is the counterpart to HADOOP-7967.  The token cache currently tries to 
> assume a filesystem's token service key.  The assumption generally worked 
> while there was a one to one mapping of filesystem to token.  With the advent 
> of multi-token filesystems like viewfs, the token cache will try to use a 
> service key (i.e. for viewfs) that will never exist (because it really gets 
> the mounted fs tokens).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
