[ 
https://issues.apache.org/jira/browse/HADOOP-7510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113703#comment-13113703
 ] 

Daryn Sharp commented on HADOOP-7510:
-------------------------------------

bq. What problems you ran into that compelled to use https port for service in 
hftp file system? Earlier and for hdfs it uses rpc port. Rpc port is used, 
because eventually its the Rpc port that is passed in TokenSelector to select a 
token (from ipc.Client). Was it tested with secure hftp setup or is there a 
different call flow that I am missing?

Before this change, {{getUri()}} returned the wrong port, i.e. the https port 
instead of the http port.  Thus if you tried {{newFs = 
FileSystem.get(hftpFs.getUri(), conf)}}, you would get a filesystem that 
wouldn't work because it would try to talk http on the https port.

My prior patch changed the uri to return the correct port.  Unfortunately, the 
token renewal was relying on the broken port behavior in the uri (https instead 
of http) so that {{getCanonicalServiceName}} would return the https port when it 
extracted the authority from the uri.  My latest change returns the correct 
(http) port in the uri and the https port for the service.
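
To make the uri/service split concrete, here is a minimal sketch.  It is not 
the actual {{HftpFileSystem}} code: the class name {{HftpAddressSketch}}, the 
hostname, and the port numbers passed in are assumptions for illustration, and 
the service is simplified to a plain host:port string.

```java
import java.net.URI;

// Hypothetical sketch, not the real HftpFileSystem. The uri carries the http
// port so it can be reopened; the service carries the https port that the
// token renewer expects.
final class HftpAddressSketch {
    private final String host;
    private final int httpPort;   // port used for hftp data access
    private final int httpsPort;  // port used for token renewal

    HftpAddressSketch(String host, int httpPort, int httpsPort) {
        this.host = host;
        this.httpPort = httpPort;
        this.httpsPort = httpsPort;
    }

    // The uri a caller can hand back to FileSystem.get(): http port.
    URI getUri() {
        return URI.create("hftp://" + host + ":" + httpPort);
    }

    // The service used to look up and renew the token: https port.
    String getCanonicalServiceName() {
        return host + ":" + httpsPort;
    }
}
```

With this shape, reopening a filesystem from {{getUri()}} talks http on the 
http port, while the token lookup still keys on the https address.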

Nothing changed with regard to rpc.  An hdfs token, whether obtained over rpc 
or http, used to be universal and could be used by either transport.  The token 
renewal change made it so a token acquired over rpc can be used with hftp, but 
a token obtained over hftp cannot be used for rpc.  Hftp looks for either an 
hftp or an rpc token, makes a copy of the token and resets the copy's type to 
rpc.  This copy is then serialized into hftp requests.  The renewer requires 
the unaltered hftp token to contain the https address.  None of this behavior 
was changed.
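
The token handling described above could be sketched roughly as follows.  This 
is a simplified model, not Hadoop's {{Token}} class: the {{HftpTokenSketch}} 
names, the kind strings, and the "first match wins" selection order are all 
assumptions made for the example.

```java
import java.util.List;

// Hypothetical model of the hftp token flow; names and kind strings are
// illustrative only and do not come from the Hadoop source.
final class HftpTokenSketch {
    static final String HFTP_KIND = "HFTP";
    static final String RPC_KIND = "HDFS_RPC";

    static final class Token {
        final String kind;
        final String service;

        Token(String kind, String service) {
            this.kind = kind;
            this.service = service;
        }
    }

    // Accept either an hftp or an rpc token; return null if neither exists.
    // Taking the first match is an assumption of this sketch.
    static Token select(List<Token> tokens) {
        for (Token t : tokens) {
            if (HFTP_KIND.equals(t.kind) || RPC_KIND.equals(t.kind)) {
                return t;
            }
        }
        return null;
    }

    // Build the copy that is serialized into hftp requests: same service,
    // kind reset to rpc. The original is left untouched for the renewer.
    static Token toWireCopy(Token original) {
        return new Token(RPC_KIND, original.service);
    }
}
```

The point of copying rather than mutating is that the renewer still sees the 
unaltered hftp token with its https address.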

{quote}
Modified the getCanonicalService changes to be compatible with expectations of 
the TokenCache.
bq. This method returns null for all the file systems that don't have a valid 
authority. Why is this change required? 
{quote}
Your change in HADOOP-7661 undid the agreed-upon change in HADOOP-7602.  I 
simply changed it back.

To summarize:  The {{TokenCache}} is the only user of 
{{getCanonicalServiceName}}.  The cache expects the value to be the token's 
service.  Until just recently, the default behavior for 
{{getCanonicalServiceName}} was to encode the authority of the fs's uri into a 
service.  If the uri had no authority, and thus lacked tokens, it would return 
junk values like ":0".  No external filesystems that relied on this behavior 
could have possibly produced a working token.

Earlier, you were very concerned about the risk of returning an empty string 
instead of ":0" for filesystems with no authority.  On HADOOP-7602, the 
agreement was to return null instead of ":0" and have the token cache skip the 
filesystem.

bq. This is a public API, I am uncomfortable modifying it to return null for 
all file systems except hdfs, hftp.
That is an incorrect reading of the code.  The default is to return a token 
service for any filesystem that contains an authority (like before).  Null is 
not returned for all other filesystems -- null is only returned when the 
filesystem has no authority, per HADOOP-7602.
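
As a rough illustration of that default, here is a sketch using only 
{{java.net.URI}}.  It is not the Hadoop implementation: 
{{ServiceNameSketch}}, its method names, and the {{defaultPort}} parameter are 
assumptions, and the service is simplified to a plain host:port string.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the default behavior agreed on in HADOOP-7602.
final class ServiceNameSketch {
    // Returns "host:port" when the uri has an authority, null otherwise.
    // defaultPort stands in for the filesystem's default port (assumption).
    static String canonicalServiceName(URI uri, int defaultPort) {
        if (uri.getAuthority() == null) {
            return null;  // no authority, so no token service (not ":0")
        }
        int port = uri.getPort() == -1 ? defaultPort : uri.getPort();
        return uri.getHost() + ":" + port;
    }

    // What a token cache would do with the result: skip filesystems whose
    // service name is null and collect the rest.
    static List<String> servicesNeedingTokens(List<URI> uris, int defaultPort) {
        List<String> services = new ArrayList<>();
        for (URI uri : uris) {
            String service = canonicalServiceName(uri, defaultPort);
            if (service != null) {
                services.add(service);
            }
        }
        return services;
    }
}
```

In this sketch a uri such as {{file:///tmp}} has no authority and is skipped, 
while any uri with a host still yields a service, whatever its scheme.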

If you are concerned about changing a public api, I'm not sure why completely 
changing the semantics of {{getCanonicalServiceName}} is not a cause for 
concern.  It's sometimes a token service (as the method name implies) and 
sometimes a uri.  That's inconsistent, very risky, and incompatible with the 
token cache's expectation that it is a service.  Just because it "works" 
doesn't mean it's right to abuse the api.

> Tokens should use original hostname provided instead of ip
> ----------------------------------------------------------
>
>                 Key: HADOOP-7510
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7510
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: security
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>             Fix For: 0.20.205.0
>
>         Attachments: HADOOP-7510-10.patch, HADOOP-7510-2.patch, 
> HADOOP-7510-3.patch, HADOOP-7510-4.patch, HADOOP-7510-5.patch, 
> HADOOP-7510-6.patch, HADOOP-7510-8.patch, HADOOP-7510-9.patch, 
> HADOOP-7510.patch
>
>
> Tokens currently store the ip:port of the remote server.  This precludes 
> tokens from being used after a host's ip is changed.  Tokens should store the 
> hostname used to make the RPC connection.  This will enable new processes to 
> use their existing tokens.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
