[ https://issues.apache.org/jira/browse/HTTPCLIENT-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869677#comment-16869677 ]
Julian Reschke edited comment on HTTPCLIENT-1995 at 6/24/19 12:53 PM:
----------------------------------------------------------------------

{noformat}
@Test
public void testAmpersand() throws Throwable {
    final URI uri = new URI("http://example.org/some/path%26with%20percent/encoded/segments");
    final URI uri2 = URIUtils.rewriteURI(uri, new HttpHost("example.net"));
    System.out.println(uri);
    System.out.println(uri2);
    Assert.assertEquals(uri.getRawPath(), uri2.getRawPath());
}
{noformat}
outputs
{noformat}
http://example.org/some/path%26with%20percent/encoded/segments
http://example.net/some/path&with%20percent/encoded/segments
{noformat}
and of course fails.

So the host is updated as described in <https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/utils/URIUtils.html#rewriteURI(java.net.URI,%20org.apache.http.HttpHost)>, but the path gets broken. I fail to see how this can be considered correct (no matter whether you choose to cite RFC 2396 or RFC 3986).

EDIT: please ignore above comment. I was wrong, "&" is not reserved in the path component.


was (Author: reschke):
{noformat}
@Test
public void testAmpersand() throws Throwable {
    final URI uri = new URI("http://example.org/some/path%26with%20percent/encoded/segments");
    final URI uri2 = URIUtils.rewriteURI(uri, new HttpHost("example.net"));
    System.out.println(uri);
    System.out.println(uri2);
    Assert.assertEquals(uri.getRawPath(), uri2.getRawPath());
}
{noformat}
outputs
{noformat}
http://example.org/some/path%26with%20percent/encoded/segments
http://example.net/some/path&with%20percent/encoded/segments
{noformat}
and of course fails.

So the host is updated as described in <https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/utils/URIUtils.html#rewriteURI(java.net.URI,%20org.apache.http.HttpHost)>, but the path gets broken. I fail to see how this can be considered correct (no matter whether you choose to cite RFC 2396 or RFC 3986).

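For readers without HttpClient on the classpath, the output quoted in the comment can be reproduced with the JDK alone. The following is only a sketch of the general decode-then-re-encode effect, not a claim about what URIUtils.rewriteURI does internally. The class PercentEncodingDemo and both helper methods are hypothetical names introduced for illustration. It relies on the documented behaviour of java.net.URI: getPath() returns the decoded path, and the multi-argument constructors quote only characters that are illegal in the path, so a space becomes %20 again while "&" stays literal.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; uses only java.net.URI from the JDK.
class PercentEncodingDemo {

    // Rebuilds the URI for a new host from the DECODED path.
    // The 7-argument constructor re-quotes illegal characters (such as the
    // space) but leaves "&" alone, so the original %26 is lost.
    static URI rebuildFromDecodedPath(URI original, String newHost) throws URISyntaxException {
        return new URI(original.getScheme(), null, newHost, -1,
                original.getPath(), null, null);
    }

    // Rebuilds the URI for a new host from the RAW path, so every
    // percent-encoded octet is carried over unchanged.
    static URI rebuildFromRawPath(URI original, String newHost) throws URISyntaxException {
        return new URI(original.getScheme() + "://" + newHost + original.getRawPath());
    }

    public static void main(String[] args) throws URISyntaxException {
        URI uri = new URI("http://example.org/some/path%26with%20percent/encoded/segments");

        // prints http://example.net/some/path&with%20percent/encoded/segments
        System.out.println(rebuildFromDecodedPath(uri, "example.net"));

        // prints http://example.net/some/path%26with%20percent/encoded/segments
        System.out.println(rebuildFromRawPath(uri, "example.net"));
    }
}
```

Both results have the same decoded path as the original. Only getRawPath() differs, which is why the assertion on the raw path in the test above fails.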
> Percent-encoded ampersand in URI path not preserved
> ---------------------------------------------------
>
>                 Key: HTTPCLIENT-1995
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1995
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient (classic)
>    Affects Versions: 4.5.8, 4.5.9
>         Environment: Linux Mint 19, OpenJDK 8
>            Reporter: none_
>            Priority: Major
>
> Starting with HttpClient 4.5.8, percent-encoded ampersand characters in URI
> path segments are not preserved any longer but written in decoded form to
> wire due to path normalization performed by URIUtils.rewriteURI(URI,
> HttpHost).
>
> According to RFC 3986 (page 11+), the ampersand character is a delimiter and
> thus needs to be percent-encoded when not used for this purpose. Path
> normalization, as performed by HttpClient v4.5.8+, creates a new URI that is
> not equivalent to the original URI and thus leads to misinterpretation on
> server/receiver side.
> ??URIs that differ in the replacement of a reserved character with its??
> ??corresponding percent-encoded octet are not equivalent. Percent-??
> ??encoding a reserved character, or decoding a percent-encoded octet??
> ??that corresponds to a reserved character, will change how the URI is??
> ??interpreted by most applications??.
>
> A very simple test case is as follows:
> {code:java}
> @Test
> public void testAmpersand() throws Throwable
> {
>     final URI uri = new URI("http://example.org/some/path%26with%20percent/encoded/segments");
>     final URI uri2 = URIUtils.rewriteURI(uri, null);
>
>     Assert.assertEquals(uri, uri2);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)