[
https://issues.apache.org/jira/browse/HADOOP-7510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113703#comment-13113703
]
Daryn Sharp commented on HADOOP-7510:
-------------------------------------
bq. What problems did you run into that compelled you to use the https port
for the service in the hftp file system? Earlier, and for hdfs, it uses the rpc
port. The rpc port is used because eventually it's the rpc port that is passed
in TokenSelector to select a token (from ipc.Client). Was it tested with a
secure hftp setup, or is there a different call flow that I am missing?
Before this change, {{getUri()}} returned the wrong port, i.e., the https port
instead of the http port. Thus if you tried {{newFs =
FileSystem.get(hftpFs.getUri(), conf)}}, you would get a filesystem that
wouldn't work because it would try to talk http on the https port.
My prior patch changed the uri to return the correct port. Unfortunately, the
token renewal was relying on the broken port behavior in the uri (https instead
of http) so that {{getCanonicalService}} would return the https port when it
extracted the authority from the uri. My last change returns the correct (http)
port in the uri and the https port for the service.
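A minimal sketch of that split, for illustration only. The class name, method
names, and port numbers below are hypothetical and are not taken from the
patch; the point is just that the uri and the token service carry different
ports.

```java
import java.net.URI;

public class HftpPortSketch {
    // Assumed port values, for illustration only.
    static final int HTTP_PORT = 50070;
    static final int HTTPS_PORT = 50470;

    /** The uri handed back to callers: must carry the http port so that
     *  reopening the filesystem from it talks http to the right port. */
    static URI fsUri(String host) {
        return URI.create("hftp://" + host + ":" + HTTP_PORT);
    }

    /** The token service: carries the https port that renewal depends on. */
    static String tokenService(String host) {
        return host + ":" + HTTPS_PORT;
    }

    public static void main(String[] args) {
        System.out.println("uri:     " + fsUri("nn.example.com"));
        System.out.println("service: " + tokenService("nn.example.com"));
    }
}
```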
Nothing changed with regard to rpc. An hdfs token, whether obtained over rpc or
http, used to be universal and could be used by either transport. The token
renewal change made it so a token acquired over rpc can be used with hftp, but
a token obtained over hftp cannot be used for rpc. Hftp looks for either an
hftp or rpc token, makes a copy of the token, and resets the copy's type to rpc.
This copy is then serialized into hftp requests. The renewer requires the
unaltered hftp token to contain the https address. None of this behavior was
changed.
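The copy-and-retype step can be sketched roughly as follows. This is a
simplified model, not the real Hadoop {{Token}} class: the {{Token}} type, the
kind strings "hftp" and "rpc", and the method name are all assumptions made up
for the example.

```java
import java.util.Arrays;
import java.util.List;

public class HftpTokenSketch {
    // Hypothetical kind labels; the real token kinds are not shown here.
    static final String HFTP_KIND = "hftp";
    static final String RPC_KIND = "rpc";

    /** Simplified stand-in for a delegation token: just a kind and a service. */
    static final class Token {
        final String kind;
        final String service;

        Token(String kind, String service) {
            this.kind = kind;
            this.service = service;
        }
    }

    /** Finds an hftp or rpc token and returns a copy retyped to rpc.
     *  The original token is left untouched so the renewer still sees it.
     *  Returns null when no usable token is present. */
    static Token copyForRequest(List<Token> tokens) {
        for (Token t : tokens) {
            if (HFTP_KIND.equals(t.kind) || RPC_KIND.equals(t.kind)) {
                return new Token(RPC_KIND, t.service);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Token original = new Token(HFTP_KIND, "nn.example.com:50470");
        Token copy = copyForRequest(Arrays.asList(original));
        System.out.println("original kind: " + original.kind);
        System.out.println("copy kind:     " + copy.kind);
    }
}
```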
{quote}
Modified the getCanonicalService changes to be compatible with expectations of
the TokenCache.
bq. This method returns null for all the file systems that don't have a valid
authority. Why is this change required?
{quote}
Your change in HADOOP-7661 undid the agreed-upon change in HADOOP-7602. I
simply changed it back.
To summarize: The {{TokenCache}} is the only user of
{{getCanonicalServiceName}}. The cache expects the value to be the token's
service. Until just recently, the default behavior for
{{getCanonicalServiceName}} was to encode the authority of the fs's uri into a
service. If the uri had no authority, and thus lacked tokens, it would return
junk values like ":0". No external filesystems that relied on this behavior
could have possibly produced a working token.
Earlier, you were very concerned about the risk of returning an empty string
instead of ":0" for filesystems with no authority. On HADOOP-7602, the
agreement was to return null instead of ":0" and have the token cache skip the
filesystem.
bq. This is a public API, I am uncomfortable modifying it to return null for
all file systems except hdfs, hftp.
That is an incorrect reading of the code. The default is to return a token
service for any filesystem that contains an authority (like before). Null is
not returned for all other filesystems -- null is only returned when the
filesystem has no authority, per HADOOP-7602.
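As a rough sketch of that default and of how a cache could skip filesystems
without an authority: the code below is illustrative only. It uses the raw uri
authority as the service, which is a simplification, and the class and method
names are hypothetical rather than the actual {{FileSystem}} or {{TokenCache}}
code.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class CanonicalServiceSketch {
    /** Simplified default: the service is derived from the uri's authority,
     *  and null is returned when the uri has no authority. */
    static String canonicalServiceName(URI fsUri) {
        return fsUri.getAuthority();
    }

    /** Collects the services a cache would need tokens for, skipping any
     *  filesystem whose service name is null. */
    static List<String> servicesNeedingTokens(List<URI> fsUris) {
        List<String> services = new ArrayList<String>();
        for (URI uri : fsUris) {
            String service = canonicalServiceName(uri);
            if (service != null) {
                services.add(service);
            }
        }
        return services;
    }

    public static void main(String[] args) {
        List<URI> uris = new ArrayList<URI>();
        uris.add(URI.create("hdfs://nn.example.com:8020/user"));
        uris.add(URI.create("file:///tmp")); // no authority, so it is skipped
        System.out.println(servicesNeedingTokens(uris));
    }
}
```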
If you are concerned about changing a public api, I'm not sure why completely
changing the semantics of {{getCanonicalServiceName}} is not a cause for
concern. It's sometimes a token service (as the method name implies) and
sometimes a uri. That's inconsistent, very risky, and incompatible with the
token cache's expectation of it being a service. Just because it "works"
doesn't mean it's right to abuse the api.
> Tokens should use original hostname provided instead of ip
> ----------------------------------------------------------
>
> Key: HADOOP-7510
> URL: https://issues.apache.org/jira/browse/HADOOP-7510
> Project: Hadoop Common
> Issue Type: Improvement
> Components: security
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Fix For: 0.20.205.0
>
> Attachments: HADOOP-7510-10.patch, HADOOP-7510-2.patch,
> HADOOP-7510-3.patch, HADOOP-7510-4.patch, HADOOP-7510-5.patch,
> HADOOP-7510-6.patch, HADOOP-7510-8.patch, HADOOP-7510-9.patch,
> HADOOP-7510.patch
>
>
> Tokens currently store the ip:port of the remote server. This precludes
> tokens from being used after a host's ip is changed. Tokens should store the
> hostname used to make the RPC connection. This will enable new processes to
> use their existing tokens.