Hi,

I have implemented this support in trunk (see SLING-1998 [1]) and described
it on the Request Parameter Handling page [2].

Regards
Felix

[1] https://issues.apache.org/jira/browse/SLING-1998
[2] http://sling.apache.org/site/request-parameters.html

On Friday, 25.02.2011, 16:12 +0000, Felix Meschberger wrote:
> Hi,
>
> The problem is that browsers tend not to tell the character encoding
> used when posting data ... Don't ask me why ;-)
>
> So we have to do guessing, something I really do not like.
>
> But it looks like browsers send POST data in the same encoding as the
> form was received in. So if the form is received as UTF-8 encoded,
> browsers send the data back encoded in UTF-8.
>
> Now, how does Sling know what encoding has been used to send the form?
> Short answer: It cannot know.
>
> Hence the _charset_ request parameter.
>
> But listening to our clients and users and understanding that most of
> the time UTF-8 is used anyway, how about this solution:
>
>   * We stick with the _charset_ parameter. Whatever that parameter
>     conveys is used to decode parameters.
>   * If the parameter does not exist, we support a new configuration
>     option defining the default encoding to be used.
>   * If the configuration option is also missing, we default to the
>     same value as we do today, which is ISO-8859-1.
>
> Of course the configuration option would not be set by default (for
> backwards compatibility reasons).
>
> Would that help your case?
>
> Regards
> Felix
>
> On Wednesday, 20.10.2010, 14:05 -0400, sam lee wrote:
> > According to:
> > http://download.oracle.com/javaee/6/api/javax/servlet/ServletRequest.html#getCharacterEncoding%28%29
> > request.getCharacterEncoding() should return "the name of the character
> > encoding used in the body of this request."
> >
> > But request.getCharacterEncoding() always seems to return ISO-8859-1.
> > For example, my html.jsp looks like:
> > <%@ page language="java" contentType="text/html; charset=UTF-8"
> >     pageEncoding="UTF-8"%>
> > ...
> > <form method="POST" action="/some/path"
> >     accept-charset="utf-8"
> >     enctype="application/x-www-form-urlencoded; charset=utf-8">
> > <input type="hidden" name="_charset_" value="UTF-8" />
> > <input type="submit" value="Save" />
> > ...
> >
> > Then I would expect request.getCharacterEncoding() (from POST.jsp) to
> > return "UTF-8". But it still returns "ISO-8859-1".
> >
> > Is this intended?
> >
> > From the Sling documentation:
> > http://sling.apache.org/site/request-parameters.html#RequestParameters-CharacterEncoding
> > I don't get this part: "This identity transformation happens to generate
> > strings as the original data was generated with ISO-8859-1 encoding."
> >
> > As long as I set _charset_ to the encoding of the rendered page (with
> > <form>), I don't have a problem. But I was wondering whether
> > .getCharacterEncoding() should be set to whatever the request body was
> > encoded as, not what Sling used to perform the "identity transform" with.
> >
> > Also, wouldn't it be better if, when _charset_ is missing from the
> > request, it were automatically set to the request body encoding? Or do
> > browsers not send request body encoding information?
> >
> > Thanks.
> > Sam
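
For readers of this thread, the fallback order proposed above (the _charset_
parameter, then a configured default, then ISO-8859-1) and the "identity
transformation" Sam asks about can be illustrated with a small sketch. This
is not the code committed for SLING-1998; the class name, method names, and
the idea of passing the configured default in as a plain string are all
assumptions made for illustration only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; NOT the actual Sling implementation.
public class ParameterEncodingSketch {

    // The encoding assumed when nothing else is known (per the thread).
    static final String LEGACY_DEFAULT = "ISO-8859-1";

    /**
     * Resolves the encoding used to decode request parameters:
     * 1. the value of the _charset_ request parameter, if usable
     * 2. the configured default encoding, if set (assumed config value)
     * 3. ISO-8859-1
     */
    static String resolveEncoding(String charsetParam, String configuredDefault) {
        if (isUsable(charsetParam)) {
            return charsetParam;
        }
        if (isUsable(configuredDefault)) {
            return configuredDefault;
        }
        return LEGACY_DEFAULT;
    }

    // True if the name is non-empty and names a charset this JVM supports.
    static boolean isUsable(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for syntactically illegal charset names.
            return false;
        }
    }

    /**
     * Re-decodes a string that was decoded as ISO-8859-1. Because
     * ISO-8859-1 maps every byte to exactly one character, getBytes()
     * recovers the original request bytes unchanged (the "identity
     * transformation"), which can then be decoded with the real encoding.
     */
    static String redecode(String isoDecoded, String encoding) {
        byte[] raw = isoDecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, Charset.forName(encoding));
    }

    public static void main(String[] args) {
        // U+00E9 sent as UTF-8 is the bytes C3 A9; decoding those bytes
        // as ISO-8859-1 yields two characters instead of one.
        String asDecodedByDefault = new String(
                "\u00e9".getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        String encoding = resolveEncoding("UTF-8", null);
        String fixed = redecode(asDecodedByDefault, encoding);

        System.out.println(asDecodedByDefault.length()); // two chars before
        System.out.println(fixed.length());              // one char after
    }
}
```

The sketch also shows why ISO-8859-1 is a safe default for the first pass:
no bytes are lost, so the parameters can still be decoded correctly once
the real encoding is known.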
