Joakim Erdfelt created HTTPCLIENT-2400:
------------------------------------------

             Summary: URLEncodedUtils.encodeFormFields() has incorrect javadoc
                 Key: HTTPCLIENT-2400
                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2400
             Project: HttpComponents HttpClient
          Issue Type: Bug
          Components: HttpClient (classic)
    Affects Versions: 5.5.1
            Reporter: Joakim Erdfelt


The javadoc for URLEncodedUtils.encodeFormFields() says ...

> Encode/escape www-url-form-encoded content.
> Uses the URLENCODER set of characters, rather than the UNRESERVED set; this
> is for compatibility with previous releases, URLEncoder.encode() and most
> browsers.

This method is not compatible with URLEncoder.encode() when a non-UTF-8 charset is used.

If we take a Japanese character and encode it once with 
URLEncodedUtils.encodeFormFields() and once with URLEncoder.encode(), we get 
different results.

I'll use the following letter ...
KATAKANA LETTER HO: ホ
https://unicodeplus.com/U+30DB

Using URLEncodedUtils.encodeFormFields("ホ", Charset.forName("Shift_JIS"))
Result: "%83z"

Using Java's URLEncoder.encode("ホ", Charset.forName("Shift_JIS")) 
Result: "%83%7A"

The result from URLEncoder.encode() is actually correct, even though the byte 
behind "%7A" (the letter "z") is part of the UNRESERVED set.

Interestingly, if you run Java's URLDecoder against the output that 
URLEncodedUtils produces, you get a replacement character.

Example, with jshell ...


{code}
$ jshell
|  Welcome to JShell -- Version 17.0.15
|  For an introduction type: /help intro

jshell> var shiftJisCharset = java.nio.charset.Charset.forName("Shift-JIS")
shiftJisCharset ==> Shift_JIS

jshell> var result = URLEncoder.encode("ホ", shiftJisCharset)
result ==> "%83%7A"

jshell> var result = URLDecoder.decode("%83%7A", shiftJisCharset)
result ==> "ホ"

jshell> var result = URLDecoder.decode("%83z", shiftJisCharset)
result ==> "�z"

{code}
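
The same comparison can be sketched as a standalone program that needs no HttpClient dependency. The encodeBytewise() helper below is hypothetical and is not the library's code: it assumes the "%83z" output comes from percent-encoding each charset byte on its own and passing through any byte that falls in a URLEncoder-style safe set (letters, digits, '-', '_', '.', '*'). Under that assumption the trailing byte 0x7A of the Shift_JIS sequence is emitted as a literal "z".

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;

public class FormEncodingDemo {

    // Hypothetical stand-in for a byte-wise form encoder (an assumption, not
    // the actual URLEncodedUtils source): every byte of the charset-encoded
    // text is checked against the safe set individually.
    public static String encodeBytewise(String text, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            int v = b & 0xFF;
            boolean safe = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9')
                    || v == '-' || v == '_' || v == '.' || v == '*';
            if (safe) {
                out.append((char) v);          // byte passes through as ASCII
            } else if (v == ' ') {
                out.append('+');               // form encoding uses '+' for space
            } else {
                out.append('%').append(String.format("%02X", v));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Charset sjis = Charset.forName("Shift_JIS");
        String ho = "\u30DB"; // KATAKANA LETTER HO

        String bytewise = encodeBytewise(ho, sjis);
        String jdk = URLEncoder.encode(ho, sjis);

        System.out.println("byte-wise sketch : " + bytewise);
        System.out.println("URLEncoder       : " + jdk);

        // Round-trip both forms through the JDK decoder.
        System.out.println("decode(URLEncoder): " + URLDecoder.decode(jdk, sjis));
        System.out.println("decode(byte-wise) : " + URLDecoder.decode(bytewise, sjis));
    }
}
```

URLEncoder decides per character rather than per byte, so a character outside its safe set has all of its bytes percent-encoded. That difference would explain why the two results diverge only for multi-byte charsets whose trail bytes overlap ASCII, such as Shift_JIS.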



