[
https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336954#comment-14336954
]
Haohui Mai commented on HDFS-7816:
----------------------------------
Taking a closer look at the RFC: the syntax of HTTP URLs is specified in RFC
1738.
{code}
; HTTP
httpurl = "http://" hostport [ "/" hpath [ "?" search ]]
hpath = hsegment *[ "/" hsegment ]
hsegment = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
search = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
safe = "$" | "-" | "_" | "." | "+"
extra = "!" | "*" | "'" | "(" | ")" | ","
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "="
hex = digit | "A" | "B" | "C" | "D" | "E" | "F" |
      "a" | "b" | "c" | "d" | "e" | "f"
escape = "%" hex hex
unreserved = alpha | digit | safe | extra
uchar = unreserved | escape
{code}
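So it looks like it is okay to leave "+" unencoded: it is listed under "safe",
which makes it an "unreserved" uchar that may appear literally in an hsegment.
However, the current webhdfs client is still broken because it does not encode
the '%' character, which the grammar only allows as the start of an escape: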
{code}
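// webhdfs client: path and query are concatenated without any encoding
final URL url = new URL(getTransportScheme(), nnAddr.getHostName(),
    nnAddr.getPort(), path + '?' + query);

// java.net.URL passes a bare '%' through unchanged:
scala> new java.net.URL("http", "localhost", 80, "/+%asdlkf")
res3: java.net.URL = http://localhost:80/+%asdlkf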
{code}
So it looks like we need to fix the client anyway.
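For illustration only, here is a minimal sketch of one way the client could
build the URL so that '%' gets escaped. It relies on the multi-argument
java.net.URI constructor, which always quotes '%' and leaves '+' alone in the
path. The class name, method name, host "nn", and port 50070 are assumptions
made up for this example; this is not the attached patch.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration; not part of the webhdfs client.
class WebHdfsUrlSketch {
  // Builds an http URL string and lets java.net.URI quote illegal characters.
  static String build(String host, int port, String path, String query) {
    try {
      // The multi-argument URI constructors always quote '%' in the path
      // and query; '+' is a legal path character and is left as-is.
      return new URI("http", null, host, port, path, query, null).toString();
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    // Same input as the scala session above.
    System.out.println(build("localhost", 80, "/+%asdlkf", null));
    // prints http://localhost:80/+%25asdlkf

    // Host and port are assumed values; the path mirrors the issue report.
    System.out.println(build("nn", 50070, "/user/somebody/abc%def", "op=OPEN"));
    // prints http://nn:50070/user/somebody/abc%25def?op=OPEN
  }
}
```

Note that this only handles characters that are illegal in a URI. Characters
such as '&', '=' and '+' are legal in the query, so they are not escaped
there, and query parameter values would still need separate encoding.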
> Unable to open webhdfs paths with "+"
> -------------------------------------
>
> Key: HDFS-7816
> URL: https://issues.apache.org/jira/browse/HDFS-7816
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 2.7.0
> Reporter: Jason Lowe
> Assignee: Kihwal Lee
> Priority: Blocker
> Attachments: HDFS-7816.patch, HDFS-7816.patch
>
>
> webhdfs requests to open files with % characters in the filename fail because
> the filename is not being decoded properly. For example:
> $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def'
> cat: File does not exist: /user/somebody/abc%25def
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)