[
https://issues.apache.org/jira/browse/HTTPCLIENT-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152833#comment-17152833
]
Mark Mielke commented on HTTPCLIENT-1995:
-----------------------------------------
This comes back to what I mentioned prior... if this was my project, I wouldn't
run it this way. But, it's not my project. So, I can't say how it is run. But,
these things are concerns:
1. Long-standing behaviour that seems self-evident, and that people rely on,
suddenly breaks in a patch release, and instead of being acknowledged as a
change in behaviour, it is declared "ok" because it is RFC-compliant. How
can a downstream project have confidence in an upstream project when patch
releases are used to introduce major changes to behaviour?
2. Differing interpretations of what the RFC says (see above for varied
interpretations of "reserved" and other terms), leading to the conclusion that
correct behaviour is the exact opposite of what the text says. It's hard for
me to believe that anybody thinks it is ok to normalize reserved characters,
and it's especially hard for me to believe anybody is still justifying it after
the specification has been found to say specifically that it should not be done.
But I think I have to check out here. This issue is in the same place as it
was in 2019. We're talking past each other, and at some point those of us who
think otherwise just have to move on. I think that is what other projects have
done: they have either bypassed the broken normalization or removed it from
their code paths.
> Percent-encoded ampersand in URI path not preserved
> ---------------------------------------------------
>
> Key: HTTPCLIENT-1995
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1995
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient (classic)
> Affects Versions: 4.5.8, 4.5.9
> Environment: Linux Mint 19, OpenJDK 8
> Reporter: none_
> Priority: Major
>
> Starting with HttpClient 4.5.8, percent-encoded ampersand characters in URI
> path segments are not preserved any longer but written in decoded form to
> wire due to path normalization performed by URIUtils.rewriteURI(URI,
> HttpHost).
>
> According to RFC 3986 (page 11+), the ampersand character is a delimiter and
> thus needs to be percent-encoded when not used for this purpose. Path
> normalization, as performed by HttpClient v4.5.8+, creates a new URI that is
> not equivalent to the original URI and thus leads to misinterpretation on
> server/receiver side.
> ??URIs that differ in the replacement of a reserved character with its??
> ??corresponding percent-encoded octet are not equivalent. Percent-??
> ??encoding a reserved character, or decoding a percent-encoded octet??
> ??that corresponds to a reserved character, will change how the URI is??
> ??interpreted by most applications??.
>
> A very simple test case is as follows:
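> {code:java}
> @Test
> public void testAmpersand() throws Throwable
> {
>     final URI uri = new URI("http://example.org/some/path%26with%20percent/encoded/segments");
>     // With a null target host, rewriteURI should leave the path as written.
>     final URI uri2 = URIUtils.rewriteURI(uri, null);
>
>     // Fails on 4.5.8+ if "%26" is written out as a literal "&".
>     Assert.assertEquals(uri, uri2);
> }
> {code}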
>
>
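The equivalence rule quoted from RFC 3986 in the description can be illustrated with `java.net.URI` alone, without HttpClient. The following is only a minimal sketch: the class name `ReservedCharEquivalence` and its helper methods are hypothetical names chosen for illustration, and the literal-ampersand URI is an assumed example of what a decoding normalizer might put on the wire.

```java
import java.net.URI;

public class ReservedCharEquivalence {

    // Path exactly as written, with percent-encoding left intact.
    static String rawPath(String uri) {
        return URI.create(uri).getRawPath();
    }

    // Path after percent-decoding; the difference between "%26"
    // and a literal "&" is no longer visible in this form.
    static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    // java.net.URI compares the raw forms of components, so an encoded
    // ampersand and a literal ampersand do not compare equal.
    static boolean sameUri(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        String encoded = "http://example.org/some/path%26with%20percent/encoded/segments";
        // Assumed example of the URI after the reserved character is decoded.
        String literal = "http://example.org/some/path&with%20percent/encoded/segments";

        System.out.println(rawPath(encoded));
        System.out.println(decodedPath(encoded));
        System.out.println(sameUri(encoded, literal));
    }
}
```

A client that rebuilds a request URI from the decoded path rather than the raw path would produce the second form, which `java.net.URI` itself treats as a different URI.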