[ https://issues.apache.org/jira/browse/HDFS-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904735#comment-16904735 ]
Masatake Iwasaki commented on HDFS-14423:
-----------------------------------------
https://tools.ietf.org/html/rfc3986#section-3.3
{quote}
URI producing applications often use the reserved characters allowed in a
segment to delimit scheme-specific or dereference-handler-specific
subcomponents. For example, the semicolon (";") and equals ("=") reserved
characters are often used to delimit parameters and parameter values applicable
to that segment. The comma (",") reserved character is often used for similar
purposes. For example, one URI producer might use a segment such as
"name;v=1.1" to indicate a reference to version 1.1 of "name", whereas another
might use a segment such as "name,1.1" to indicate the same. Parameter types
may be defined by scheme-specific semantics, but in most cases the syntax of a
parameter is specific to the implementation of the URI's dereferencing
algorithm.
{quote}
{noformat}
reserved      = gen-delims / sub-delims
gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="
...
path          = path-abempty    ; begins with "/" or is empty
              / path-absolute   ; begins with "/" but not "//"
              / path-noscheme   ; begins with a non-colon segment
              / path-rootless   ; begins with a segment
              / path-empty      ; zero characters
path-abempty  = *( "/" segment )
path-absolute = "/" [ segment-nz *( "/" segment ) ]
path-noscheme = segment-nz-nc *( "/" segment )
path-rootless = segment-nz *( "/" segment )
path-empty    = 0<pchar>
segment       = *pchar
segment-nz    = 1*pchar
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
              ; non-zero-length segment without any colon ":"
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
{noformat}
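To make the grammar above concrete, here is a minimal Java sketch that checks whether a string is a valid path segment. The class and method names are made up for illustration. The definitions of unreserved and pct-encoded are not in the excerpt above and are taken from RFC 3986 sections 2.3 and 2.1. The sketch is only meant to show that "+", ";", "=" and "," are legal in a segment as literal characters, while "%" is legal only as the start of a percent-encoded triplet.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: validates one path segment
// against the RFC 3986 "segment = *pchar" rule.
final class Rfc3986Segment {
    // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"  (RFC 3986, section 2.3)
    private static final String UNRESERVED = "A-Za-z0-9\\-._~";
    // sub-delims, as listed in the grammar quoted above
    private static final String SUB_DELIMS = "!$&'()*+,;=";
    // pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
    // pct-encoded = "%" HEXDIG HEXDIG  (RFC 3986, section 2.1)
    private static final Pattern SEGMENT = Pattern.compile(
            "(?:[" + UNRESERVED + SUB_DELIMS + ":@]|%[0-9A-Fa-f]{2})*");

    static boolean isValidSegment(String segment) {
        return SEGMENT.matcher(segment).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"a+b", "name;v=1.1", "name,1.1", "partition_key=%2F", "%", "a b"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValidSegment(s));
        }
    }
}
```

Under these rules a literal "+" in a path segment needs no escaping and should not be read as a space, whereas a bare "%" has to be sent as "%25".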
> Percent (%) and plus (+) characters no longer work in WebHDFS
> -------------------------------------------------------------
>
> Key: HDFS-14423
> URL: https://issues.apache.org/jira/browse/HDFS-14423
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 3.2.0, 3.1.2
> Environment: Ubuntu 16.04, but I believe this is irrelevant.
> Reporter: Jing Wang
> Assignee: Wei-Chiu Chuang
> Priority: Major
> Attachments: HDFS-14423.001.patch
>
>
> The following commands with percent (%) no longer work starting with version
> 3.1:
> {code:java}
> $ hadoop/bin/hdfs dfs -touchz webhdfs://localhost/%
> $ hadoop/bin/hdfs dfs -cat webhdfs://localhost/%
> cat: URLDecoder: Incomplete trailing escape (%) pattern
> {code}
> Also, plus (+) characters get turned into spaces when doing DN operations:
> {code:java}
> $ hadoop/bin/hdfs dfs -touchz webhdfs://localhost/a+b
> $ hadoop/bin/hdfs dfs -mkdir webhdfs://localhost/c+d
> $ hadoop/bin/hdfs dfs -ls /
> Found 4 items
> -rw-r--r-- 1 jing supergroup 0 2019-04-12 11:20 /a b
> drwxr-xr-x - jing supergroup 0 2019-04-12 11:21 /c+d
> {code}
> I can confirm that these commands work correctly on 2.9 and 3.0. Also, the
> usual hdfs:// client works as expected.
> I suspect a relation with HDFS-13176 or HDFS-13582, but I'm not sure what the
> right fix is. Note that Hive uses % to escape special characters in partition
> values, so banning % might not be a good option. For example, Hive will
> create paths like {{table_name/partition_key=%2F}} when
> {{partition_key='/'}}.
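Both symptoms in the report can be reproduced outside Hadoop with plain JDK classes. The following is a minimal sketch, not the WebHDFS code path: the class and method names are invented, and the assumption that form-style decoding via java.net.URLDecoder is what gets applied to the path is inferred only from the error text in the report. It contrasts URLDecoder with java.net.URI, which decodes percent-escapes only.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of Hadoop.
final class PathDecodingDemo {
    // Form-style decoding: "+" becomes a space and a bare "%" is rejected.
    static String formDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Percent-encode a file system path for use as a URI path.
    // The multi-argument URI constructors always quote "%" and keep "+".
    static String encodePath(String path) {
        try {
            return new URI(null, null, path, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Decode percent-escapes only, leaving "+" untouched.
    static String decodePath(String rawPath) {
        return URI.create(rawPath).getPath();
    }

    public static void main(String[] args) {
        System.out.println(formDecode("/a+b"));              // prints "/a b"
        try {
            formDecode("/%");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(encodePath("/%"));                // prints "/%25"
        System.out.println(decodePath(encodePath("/%")));    // prints "/%"
        System.out.println(decodePath(encodePath("/a+b")));  // prints "/a+b"
    }
}
```

With the encode/decode pair based on java.net.URI, the Hive-style path {{table_name/partition_key=%2F}} also round-trips with its literal "%2F" intact, since the "%" is sent as "%25" and decoded exactly once.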