[
https://issues.apache.org/jira/browse/SLING-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072683#comment-13072683
]
Eric Norman commented on SLING-2143:
------------------------------------
Hi Stephen,
I believe that since the operation now uses the RequestParameter object to
get the content, the _charset_ parameter handling should already convert
the values to the requested charset, as documented in [1]. Do you have a
scenario where this does not work? Can you provide a test case?
1. http://sling.apache.org/site/request-parameters.html
> SlingPostServlet ImportOperation :content parameter overrides _charset_
> parameter with system default encoding
> --------------------------------------------------------------------------------------------------------------
>
> Key: SLING-2143
> URL: https://issues.apache.org/jira/browse/SLING-2143
> Project: Sling
> Issue Type: Bug
> Components: Servlets
> Affects Versions: Servlets Post 2.1.0
> Environment: OS: Windows 2k8, Windows 7
> Web Servers: Tomcat 6
> Sling 6
> Reporter: Stephen Sanchez
> Assignee: Eric Norman
> Fix For: Servlets Post 2.1.2
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The ImportOperation on the SlingPostServlet (2.1.0 to trunk) does not honor
> the encoding specified in the _charset_ form parameter of a POST request.
> REPRODUCTION STEPS:
> Create a POST request using the SlingPostServlet and ImportOperation:
> curl -F":operation=import" -F"_charset_=UTF-8" -F":contentType=json"
> -F":replace=true" -F":replaceProperties=true" -F":content={'latin':'øµå',
> 'chinese':'玄牛'}" http://admin:admin@localhost:8080
> PROPOSED SOLUTION:
> In ImportOperation.java, line 137 (as of 2.1.0 tag):
> contentStream = new ByteArrayInputStream(content.getBytes());
> This line takes the :content parameter from the request, which arrives
> properly encoded (e.g. UTF-8), and gets its bytes using the system default
> encoding. On Windows, this causes all non-ASCII Unicode characters in the
> content to be encoded with the wrong encoding.
> The simple fix, which resolves the issue and allows UTF-8 encoding across all
> operating systems:
> contentStream = new ByteArrayInputStream(content.getBytes("UTF-8"));
> My proposed fix is to support the _charset_ form parameter, since :content
> is a parameter on the form in the POST request:
> RequestParameter encodingParam = request.getRequestParameter("_charset_");
> byte[] contentBytes;
> if (encodingParam != null && encodingParam.getString() != null) {
>     contentBytes = content.getBytes(encodingParam.getString());
> } else {
>     contentBytes = content.getBytes();
> }
> contentStream = new ByteArrayInputStream(contentBytes);
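
The effect described in the report can be reproduced outside Sling with plain
JDK calls. The following is a minimal sketch, not the actual ImportOperation
code: the class CharsetSketch and its helpers resolveCharset and encode are
hypothetical names, ISO-8859-1 stands in for a non-UTF-8 platform default, and
falling back to UTF-8 when _charset_ is missing or unknown is an assumption
(the proposal above falls back to the platform default instead).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetSketch {

    // Hypothetical helper: map the value of the _charset_ form field to a
    // Charset. Assumption: fall back to UTF-8 when the value is missing,
    // blank, or names a charset the JVM does not support.
    static Charset resolveCharset(String charsetParam) {
        if (charsetParam == null || charsetParam.trim().isEmpty()) {
            return StandardCharsets.UTF_8;
        }
        try {
            return Charset.forName(charsetParam.trim());
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException.
            return StandardCharsets.UTF_8;
        }
    }

    // Encode the content with an explicit charset instead of relying on
    // String.getBytes(), which uses the platform default.
    static byte[] encode(String content, String charsetParam) {
        return content.getBytes(resolveCharset(charsetParam));
    }

    public static void main(String[] args) {
        // Same payload as the curl example, written with Unicode escapes so
        // the result does not depend on the source file encoding.
        String content =
            "{'latin':'\u00f8\u00b5\u00e5', 'chinese':'\u7384\u725b'}";

        byte[] utf8Bytes = encode(content, "UTF-8");
        byte[] legacyBytes = encode(content, "ISO-8859-1");

        // A downstream reader that expects UTF-8 only recovers the original
        // text when the bytes were produced with UTF-8.
        String fromUtf8 = new String(utf8Bytes, StandardCharsets.UTF_8);
        String fromLegacy = new String(legacyBytes, StandardCharsets.UTF_8);

        System.out.println("UTF-8 round trip intact: "
            + fromUtf8.equals(content));
        System.out.println("ISO-8859-1 bytes read as UTF-8 intact: "
            + fromLegacy.equals(content));
    }
}
```

A check along these lines could serve as the starting point for the requested
test case: post the content with _charset_=UTF-8 and compare the stored
property values against the original strings.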