Using Spring's CharacterEncodingFilter did the trick, thanks everyone for the useful suggestions! I set everything to utf-8:
- Tomcat's URIEncoding set to utf-8
- character encoding of forms / html pages set to utf-8
- encoding for ajax requests set to utf-8
- container-encoding / form-encoding parameters of CocoonServlet set to utf-8
- added Spring's CharacterEncodingFilter to web.xml, set encoding to utf-8 with force = true.

afaics, encoding works in all places, nice!

regards
Dennis

On Mon, Oct 12, 2009 at 1:32 PM, Bart van der Schans <[email protected]> wrote:

> In my experience the biggest problem with encoding are the browsers,
> especially the older ones. For example, most browsers handle URI
> encoding differently. Some issues are historical and some issues are
> just plain wrong (for example some browsers mix encodings: they do the
> url part in latin-1 and the query part in utf-8). The only way I
> usually get everything working is to *force* everything to utf-8
> everywhere it's possible. This includes setting the URIEncoding in
> tomcat/jetty, adding the spring encoding filter, setting the jsp page
> encoding in the web.xml, setting the meta tags in html, specifying
> form encodings for forms, etc. etc.
>
> On a side note: don't enable mod_php when you're using mod_jk. It will
> cause your urls to get double encoded :-(
>
> Regards,
> Bart
>
>
> On Fri, Oct 9, 2009 at 9:14 AM, Dennis Dam <[email protected]> wrote:
>
> >> According to a javadoc document of spring framework, it says, "current
> >> browsers typically do not set a character encoding even if specified
> >> in the HTML page or form." [1]
> >> So, I think we need to assume that the request encoding is one
> >> specific one. Currently we have a good alternative one: UTF-8.
> >> Before UTF-8 became popular, I used to determine the encoding
> >> based on the user's language. (e.g., "ko" : KSC5601 or EUC-KR, "en" :
> >> ISO-8859-1, "ja" : Shift_JIS, ...)
> >> However, I think you don't have any problem with the assumption of
> >> UTF-8 today in most cases.
> >>
> >> [1]
> >> http://static.springsource.org/spring/docs/2.5.x/api/org/springframework/web/filter/CharacterEncodingFilter.html
> >>
> >
> > Yes, I agree you have to assume something, but I guess the assumptions
> > differ according to how requests are submitted. The browser sometimes sends
> > it in utf-8 (GET methods), and sometimes in latin-1 (POST method). In case
> > of the POST parameters, the browser didn't set an encoding (encoding is
> > null), in case of the GET parameters, the correct character set is set by
> > the browser (encoding is utf-8). A third case is POST submits executed by
> > javascript ("ajax"), which also set the correct character set.
> >
> > I did a little experiment with a test filter, which converts parameters from
> > latin-1 to utf-8, in case the encoding is null, and then prints those
> > parameters, with some debug code. POST parameters submitted from regular
> > forms (no ajax) are shown correctly once they are converted to utf-8.
> >
> >> >
> >> > Of course I can fix it with a workaround: implement a filter that converts
> >> > only POST parameters, if the incoming encoding is NULL or anything else than
> >> > utf-8. But I'd rather like to solve the cause of the problem :)
> >>
> >> You don't have to implement a new filter. As Ard mentioned, you can use
> >> "CharacterEncodingFilter" of Spring Framework. If
> >> request.setCharacterEncoding() has been ever invoked, then
> >> request.getParameter() returns a converted string from the container
> >> encoding to target encoding. The "CharacterEncodingFilter" is doing
> >> this.
> >>
> >
> > I'm not sure if the filter will work in my case .. it only sends the
> > encoding if the request encoding is *not* null. In the case of FORM posts,
> > the encoding is null. I will give a try though!
> >
> >>
> >> Regards,
> >>
> >> Woonsan
> >>
> >> >
> >> > regards
> >> > Dennis
> >> >
> >> > On Wed, Oct 7, 2009 at 2:24 PM, Dennis Dam <[email protected]> wrote:
> >> >
> >> >> On Wed, Oct 7, 2009 at 2:09 PM, Bartosz Oudekerk <[email protected]> wrote:
> >> >>
> >> >>> Ard Schrijvers wrote:
> >> >>>
> >> >>>>> 1. added system property -Dfile.encoding=UTF-8 to catalina.sh
> >> >>>>> 2. added URIEncoding=utf-8 to the 8080 connector in conf/server.xml
> >> >>>>> 3. the container-encoding init parameter for the Cocoon servlet is set to
> >> >>>>>    "ISO-8859-1".
> >> >>>>> 4. the form-encoding init parameter for the Cocoon servlet is set to
> >> >>>>>    "utf-8"
> >> >>>>
> >> >>>> why is (3) not utf-8??
> >> >>>
> >> >>> Because if it's set to UTF-8, then things tend to get doubly encoded.
> >> >>
> >> >> not exactly.. it depends on the value of "form-encoding", if
> >> >> container-encoding + form-encoding are identical (e.g. both set to utf-8),
> >> >> then Cocoon does not perform encoding conversions.
> >> >>
> >> >> Why is (3) not set to UTF-8?? Because then the form POST encoding is broken
> >> >> :)) With ISO the form GETs are broken .. :)
> >> >>
> >> >>> Regards,
> >> >>> --
> >> >>> Bartosz Oudekerk
> >> >>> .---------------------------------.-----------------------------------.
> >> >>> | Hippo B.V.                      | Hippo USA Inc.                    |
> >> >>> | Oosteinde 11                    | 101 H Street, suite Q Petaluma CA |
> >> >>> | 1017 WT Amsterdam               | 94952-5100 San Francisco          |
> >> >>> | The Netherlands                 | United States                     |
> >> >>> | Tel +31 (0)20 5224466           | +1 (707) 773-4646                 |
> >> >>> +---------------------------------+-----------------------------------+
> >> >>> | [email protected]               | http://www.onehippo.com           |
> >> >>> `---------------------------------^-----------------------------------'
> >> >>> ********************************************
> >> >>> Hippocms-dev: Hippo CMS development public mailinglist
> >> >>>
> >> >>> Searchable archives can be found at:
> >> >>> MarkMail: http://hippocms-dev.markmail.org
> >> >>> Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
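
For anyone finding this thread later: the test-filter experiment mentioned in the thread (POST parameters that arrive as utf-8 bytes but get decoded as latin-1, then converted back to utf-8) can be sketched in plain Java roughly as below. This is only an illustration under my own assumptions, not the actual filter code; the class name EncodingRepair and both method names are made up for the example, and it uses only java.nio.charset from the JDK, no servlet or Spring classes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Cocoon, Spring or Tomcat.
class EncodingRepair {

    // Simulates a container that decodes utf-8 request bytes as ISO-8859-1
    // because the request carried no character encoding.
    public static String decodeAsLatin1(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Undoes the wrong decoding: ISO-8859-1 maps every byte to exactly one
    // char, so the original bytes can be recovered and decoded as utf-8.
    public static String repair(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café"
        byte[] onTheWire = original.getBytes(StandardCharsets.UTF_8);

        String garbled = decodeAsLatin1(onTheWire);
        String repaired = repair(garbled);

        System.out.println("garbled equals original:  " + garbled.equals(original));
        System.out.println("repaired equals original: " + repaired.equals(original));
    }
}
```

The repair step only makes sense when the parameter really was decoded as latin-1; applying it to a string that was already decoded correctly would garble it instead, which is the kind of double conversion discussed in the thread. Setting the request encoding to utf-8 before any parameter is read avoids the need for a repair step in the first place.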
