[ https://issues.apache.org/jira/browse/HTTPCLIENT-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Oleg Kalnichevski resolved HTTPCLIENT-1990.
-------------------------------------------
Resolution: Invalid
> URIUtils.rewriteURI mangles unicode characters
> ----------------------------------------------
>
> Key: HTTPCLIENT-1990
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1990
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpCache
> Affects Versions: 4.5.8
> Reporter: Nicholas Wilson
> Priority: Minor
>
> The following test case illustrates a problem with URIUtils that I have
> encountered:
> {code:java}
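> import java.net.URI;
> import org.apache.http.client.utils.URIUtils;
> // Assumption: UriComponentsBuilder is the Spring Web class of that name.
> import org.springframework.web.util.UriComponentsBuilder;
>
> public class Main {
>     public static void main(String[] args) throws Exception {
>         // Build a URI whose last path segment contains non-ASCII characters.
>         URI uri = UriComponentsBuilder.fromUriString("https://host/path")
>                 .pathSegment("üñîçøðé")
>                 .build()
>                 .toUri();
>         System.out.printf("rawPath = %s\n", uri.getRawPath());
>         System.out.printf("path = %s\n", uri.getPath());
>         // Rewrite with normalisation enabled, then print both forms again.
>         uri = URIUtils.rewriteURI(uri, null,
>                 URIUtils.DROP_FRAGMENT_AND_NORMALIZE);
>         System.out.printf("rawPath = %s\n", uri.getRawPath());
>         System.out.printf("path = %s\n", uri.getPath());
>     }
> }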
> {code}
> I encountered the issue because previous versions of httpclient did not
> perform path normalisation (the main caller is ProtocolExec in the HTTP
> client) and effectively only applied URIUtils.DROP_FRAGMENT, so users who
> upgrade get the new normalisation behaviour unexpectedly.
> The bug exhibited by URIUtils.rewriteURI is actually caused by
> URLEncodedUtils.urlDecode (called from URIBuilder's constructor via
> URIBuilder.parsePath), which does something truly nasty. It takes a String (a
> logical sequence of Unicode code points), turns it into a CharBuffer, then
> iterates over it, narrowing each char to a byte! Strange, but true.
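
The effect described in the paragraph above can be reproduced with the JDK alone. The following is a minimal sketch, not the HttpClient source: `narrowThenDecode` is a hypothetical helper, and it assumes (as the report describes) that each char is narrowed to a single byte and the resulting buffer is then decoded as UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class NarrowingDemo {
    // Hypothetical helper, not part of HttpClient: narrows every char to one
    // byte (dropping the high 8 bits), then decodes the bytes as UTF-8.
    static String narrowThenDecode(String s) {
        ByteBuffer bb = ByteBuffer.allocate(s.length());
        for (int i = 0; i < s.length(); i++) {
            bb.put((byte) s.charAt(i));
        }
        bb.flip();
        return StandardCharsets.UTF_8.decode(bb).toString();
    }

    public static void main(String[] args) {
        String ascii = "path";
        String nonAscii = "\u00fc\u00f1"; // "üñ"
        // ASCII chars fit in one byte and are valid UTF-8, so they survive.
        System.out.println(narrowThenDecode(ascii).equals(ascii));       // true
        // 0xFC and 0xF1 are not valid standalone UTF-8 bytes, so the decoder
        // substitutes U+FFFD and the original characters are lost.
        System.out.println(narrowThenDecode(nonAscii).equals(nonAscii)); // false
    }
}
```

Under these assumptions, only input that is already pure ASCII (for example a fully percent-encoded path) passes through such a step unchanged.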
> Unicode characters in a java.net.URI are legal, as far as I can tell, and
> should simply be escaped as percent-encoded UTF-8 bytes, as in the value
> returned by URI.getRawPath. They are not escaped, however, in the value
> returned by URI.getPath, which is what URIUtils.rewriteURI uses.
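
The difference between the two accessors can be seen with java.net.URI on its own. This is a small sketch under stated assumptions: `withSegment` and `toAscii` are hypothetical helpers introduced only for illustration, and the host and path are placeholders taken from the test case above.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class RawPathDemo {
    // Hypothetical helper: builds https://host/path/<segment> from an
    // unescaped segment using the multi-argument URI constructor.
    static URI withSegment(String segment) {
        try {
            return new URI("https", "host", "/path/" + segment, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical helper: round-trips through toASCIIString() so that every
    // non-ASCII character becomes percent-encoded UTF-8.
    static URI toAscii(URI uri) {
        return URI.create(uri.toASCIIString());
    }

    public static void main(String[] args) {
        URI uri = toAscii(withSegment("\u00fc\u00f1")); // segment is "üñ"
        // The raw path keeps the percent-encoded UTF-8 bytes.
        System.out.println(uri.getRawPath()); // /path/%C3%BC%C3%B1
        // The decoded path contains the original non-ASCII characters again.
        System.out.println(uri.getPath().equals("/path/\u00fc\u00f1")); // true
    }
}
```

Code that reads `getPath()` therefore has to cope with non-ASCII characters even when the URI it was given is fully percent-encoded.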