[
https://issues.apache.org/jira/browse/HTTPCLIENT-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153955#comment-17153955
]
Roy T. Fielding commented on HTTPCLIENT-1995:
---------------------------------------------
Oleg, please listen to what is being reported. The ampersand (&) character is
reserved in RFC1808, RFC2396, and RFC3986. Of those, RFC3986 is the only spec
that currently applies to all Internet-facing software, regardless of its
version or use of Java utility libraries. At no time in the entire history of
the Web has it ever been allowed for a client to normalize %26 as &, regardless
of where it appears in a URI.
Your interpretation of what those RFCs say is simply wrong, throughout this
discussion.
It doesn't matter that %26 and & may be interpreted as the same data by a
generator or origin server, because HttpClient is not a generator or origin
server and this specific bug is in automatic normalization.
It doesn't matter that pchar or sub-delims contains "&" because that is how the
specification defines the grammar that allows specific reserved characters to
appear within a component. Those characters are still reserved.
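
A small illustration of that point, using only the JDK's java.net.URI rather
than HttpClient: both spellings below are syntactically valid paths, yet they
are different URIs. The class name and the sameUri helper are made up for this
sketch; only the java.net.URI calls are real API.

```java
import java.net.URI;

public class ReservedAmpersandDemo {
    // Hypothetical helper: compares two URI strings using java.net.URI.equals.
    static boolean sameUri(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        URI encoded = URI.create("http://example.org/some/path%26with");
        URI literal = URI.create("http://example.org/some/path&with");

        // Both parse: "&" is permitted in a path segment by the grammar.
        System.out.println(encoded.getRawPath()); // /some/path%26with
        System.out.println(literal.getRawPath()); // /some/path&with

        // The decoded view hides the difference, which is why it must not
        // be used to rebuild the URI that goes on the wire.
        System.out.println(encoded.getPath());    // /some/path&with

        // The two URIs are not equivalent.
        System.out.println(encoded.equals(literal)); // false
    }
}
```

The raw path keeps the distinction that the server may depend on; the decoded
path does not.
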
It doesn't matter which RFCs the code claims to conform to, since it doesn't
conform to any of them (and, regardless, that's not how RFCs work). The RFCs
exist to help you write interoperable software. The relevant section in the
relevant RFC describes the correct behavior, as noted by the OP.
There are thousands of implementations of URI parsers and hundreds of
normalizers. None of them agree with your interpretation because it DOES NOT
INTEROPERATE WITH OTHER IMPLEMENTATIONS. It simply cannot work in practice,
regardless of what the RFCs say, because changing the character from %26 to &
will break existing applications on origin servers that do not use your code.
And, no, the authors of the RFCs are not responsible for your desire to
misinterpret them. If you had to say that SO MANY times, maybe it's because you
didn't understand what was written and haven't been listening to those who do.
To be clear, sub-delims are reserved and they appear in the grammar for pchar
because that is how ABNF describes that "& is allowed in the URI path segments
AND is a reserved character". It does not mean what you think it means.
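
For comparison, here is a minimal sketch of percent-encoding normalization
that stays within what RFC 3986 section 6.2.2 permits: it decodes only octets
that correspond to unreserved characters and uppercases the hex digits of
everything else. This is not HttpClient code and not a proposed patch; the
class and method names are invented for the example.

```java
public class SafePercentNormalizer {
    // RFC 3986 unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }

    // Returns 0..15 for an ASCII hex digit, or -1 for anything else.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Normalizes percent-encoded triplets in a raw URI component.
    // Reserved characters such as "&" (%26) and "/" (%2F) stay encoded.
    static String normalize(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int i = 0;
        while (i < raw.length()) {
            char ch = raw.charAt(i);
            if (ch == '%' && i + 2 < raw.length()) {
                int hi = hexValue(raw.charAt(i + 1));
                int lo = hexValue(raw.charAt(i + 2));
                if (hi >= 0 && lo >= 0) {
                    int octet = hi * 16 + lo;
                    if (isUnreserved(octet)) {
                        // Safe to decode: the URI stays equivalent.
                        out.append((char) octet);
                    } else {
                        // Keep the triplet, only uppercase its hex digits.
                        out.append('%')
                           .append(Character.toUpperCase(raw.charAt(i + 1)))
                           .append(Character.toUpperCase(raw.charAt(i + 2)));
                    }
                    i += 3;
                    continue;
                }
            }
            // Literal character or malformed escape: copy through unchanged.
            out.append(ch);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("/some/path%26with%20percent/encoded/segments"));
        System.out.println(normalize("/%7Euser/a%2fb"));
    }
}
```

With this approach the path from the reported test case comes back unchanged,
while "/%7Euser/a%2fb" becomes "/~user/a%2Fb".
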
> Percent-encoded ampersand in URI path not preserved
> ---------------------------------------------------
>
> Key: HTTPCLIENT-1995
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1995
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient (classic)
> Affects Versions: 4.5.8, 4.5.9
> Environment: Linux Mint 19, OpenJDK 8
> Reporter: none_
> Priority: Major
>
> Starting with HttpClient 4.5.8, percent-encoded ampersand characters in URI
> path segments are no longer preserved; they are written to the wire in
> decoded form because of the path normalization performed by
> URIUtils.rewriteURI(URI, HttpHost).
>
> According to RFC 3986 (page 11+), the ampersand character is a delimiter and
> thus needs to be percent-encoded when not used for this purpose. Path
> normalization, as performed by HttpClient v4.5.8+, creates a new URI that is
> not equivalent to the original URI and thus leads to misinterpretation on
> server/receiver side.
> ??URIs that differ in the replacement of a reserved character with its??
> ??corresponding percent-encoded octet are not equivalent. Percent-??
> ??encoding a reserved character, or decoding a percent-encoded octet??
> ??that corresponds to a reserved character, will change how the URI is??
> ??interpreted by most applications??.
>
> A very simple test case is as follows:
> {code:java}
> @Test
> public void testAmpersand() throws Throwable
> {
>     final URI uri = new URI("http://example.org/some/path%26with%20percent/encoded/segments");
>     final URI uri2 = URIUtils.rewriteURI(uri, null);
>
>     // The rewritten URI must equal the original, %26 included.
>     Assert.assertEquals(uri, uri2);
> }
> {code}
>
>