[
https://issues.apache.org/jira/browse/HTTPCLIENT-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904252#action_12904252
]
Jared Jacobs commented on HTTPCLIENT-884:
-----------------------------------------
If you are sending simple ASCII content, then you can try specifying "US-ASCII"
or "ISO-8859-1" as the second argument to the UrlEncodedFormEntity constructor.
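To illustrate why that workaround is safe for plain ASCII content, here is a minimal sketch that uses only the JDK's java.net.URLEncoder (Java 10+), not HttpClient itself. The class and method names are hypothetical. It shows that an ASCII value percent-encodes to the same bytes under US-ASCII, ISO-8859-1 and UTF-8, while a non-ASCII value does not, which is the case where the recipient needs to know the charset.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not reproduce HttpClient's own encoding code.
class AsciiFormEncodingDemo {

    // Percent-encode one form value with the given charset.
    static String encode(String value, Charset charset) {
        return URLEncoder.encode(value, charset);
    }

    public static void main(String[] args) {
        String ascii = "hello world";
        String accented = "caf\u00e9"; // "café", contains one non-ASCII character

        // Pure ASCII: the chosen charset makes no difference to the encoded form.
        System.out.println(encode(ascii, StandardCharsets.US_ASCII));
        System.out.println(encode(ascii, StandardCharsets.ISO_8859_1));
        System.out.println(encode(ascii, StandardCharsets.UTF_8));

        // Non-ASCII: the encoded bytes depend on the charset.
        System.out.println(encode(accented, StandardCharsets.ISO_8859_1));
        System.out.println(encode(accented, StandardCharsets.UTF_8));

        // With HttpClient the equivalent choice would be made via the second
        // constructor argument mentioned above, e.g.
        //   new UrlEncodedFormEntity(parameters, "US-ASCII")
    }
}
```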
Oleg, you may want to provide a way in UrlEncodedFormEntity to exclude the
"charset" parameter here. Apparently it's still common for servers not to parse
media type parameters correctly. For backwards compatibility, browser vendors
decided not to specify the charset in the Content-Type header (even though
doing so is arguably the most correct approach and a common practice -- just
search the web). Instead, they give authors the option of sending the character
set as an extra "_charset_" parameter in the request's body, and that practice
made it into the HTML 5 spec:
http://www.w3.org/TR/html5/association-of-controls-and-forms.html#url-encoded-form-data
Unlike most media types, which are "owned" by IANA, the non-standard
"application/x-www-form-urlencoded" is owned by the browser vendors and the W3C.
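The "_charset_" convention described above can be sketched roughly as follows, using only the JDK (Java 10+). This is an illustration under assumptions, not HttpClient code and not the HTML 5 algorithm verbatim: the class name, the method names and the choice to put "_charset_" first in the body are all hypothetical.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper showing the charset carried in the body instead of
// in the Content-Type header.
class CharsetFieldDemo {

    static final String CHARSET_FIELD = "_charset_";

    // Sender side: encode all pairs with `charset` and announce it in the body.
    static String buildBody(Map<String, String> params, Charset charset) {
        StringJoiner body = new StringJoiner("&");
        body.add(CHARSET_FIELD + "="
                + URLEncoder.encode(charset.name(), StandardCharsets.US_ASCII));
        for (Map.Entry<String, String> e : params.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), charset) + "="
                    + URLEncoder.encode(e.getValue(), charset));
        }
        return body.toString();
    }

    // Receiver side: look for "_charset_" first, then decode the other pairs.
    // Charset.forName throws if the announced name is unknown or illegal.
    static Map<String, String> parseBody(String body, Charset fallback) {
        String[] pairs = body.split("&");
        Charset charset = fallback;
        String prefix = CHARSET_FIELD + "=";
        for (String pair : pairs) {
            if (pair.startsWith(prefix)) {
                String name = URLDecoder.decode(
                        pair.substring(prefix.length()), StandardCharsets.US_ASCII);
                charset = Charset.forName(name);
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : pairs) {
            if (pair.isEmpty() || pair.startsWith(prefix)) {
                continue; // skip empty segments and the charset marker itself
            }
            int eq = pair.indexOf('=');
            String rawName = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            result.put(URLDecoder.decode(rawName, charset),
                    URLDecoder.decode(rawValue, charset));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name", "caf\u00e9");
        String body = buildBody(params, StandardCharsets.UTF_8);
        System.out.println(body);
        // Fallback is ISO-8859-1, but the body's "_charset_" field overrides it.
        System.out.println(parseBody(body, StandardCharsets.ISO_8859_1));
    }
}
```

With this approach the Content-Type header can stay as plain "application/x-www-form-urlencoded", so servers that mishandle media type parameters are unaffected.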
> Charset omitted from UrlEncodedFormEntity Content-Type header
> -------------------------------------------------------------
>
> Key: HTTPCLIENT-884
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-884
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient
> Affects Versions: 4.0 Final
> Environment: all
> Reporter: Jared Jacobs
> Priority: Minor
> Fix For: 4.0.1, 4.1 Alpha1
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> UrlEncodedFormEntity sets the Content-Type header to:
> "application/x-www-form-urlencoded"
> It should set the header to:
> "application/x-www-form-urlencoded; charset=" + charset
> As a result, content can be misinterpreted by the recipient (e.g. if the
> entity content includes multibyte Unicode characters encoded with the "UTF-8"
> charset).
> For a correct example of specifying the charset in the Content-Type header,
> see StringEntity.java.
> Here's the fix:
>      public UrlEncodedFormEntity (
>          final List <? extends NameValuePair> parameters,
>          final String encoding) throws UnsupportedEncodingException {
>          super(URLEncodedUtils.format(parameters, encoding), encoding);
> -        setContentType(URLEncodedUtils.CONTENT_TYPE);
> +        setContentType(URLEncodedUtils.CONTENT_TYPE + HTTP.CHARSET_PARAM +
> +            (encoding != null ? encoding : HTTP.DEFAULT_CONTENT_CHARSET));
>      }
>      public UrlEncodedFormEntity (
>          final List <? extends NameValuePair> parameters) throws
>          UnsupportedEncodingException {
> -        super(URLEncodedUtils.format(parameters, HTTP.DEFAULT_CONTENT_CHARSET),
> -            HTTP.DEFAULT_CONTENT_CHARSET);
> -        setContentType(URLEncodedUtils.CONTENT_TYPE);
> +        this(parameters, HTTP.DEFAULT_CONTENT_CHARSET);
>      }