On Sat, 2004-05-29 at 12:26, Antonio Gallardo wrote:
> Bruno Dumon said:
> >> I only can't explain why the container-encoding in web.xml has to be set
> >> to ISO-8859-1. If anybody knows about this, please add it to this text.
> >> Any other setting I tried to use didn't work out.
> >
> > It has to be ISO-8859-1, always. This is because the servlet
> > specification requires that request parameters are by default decoded as
> > ISO-8859-1 (regardless of the default platform encoding). The only
> > reason I can imagine this is configurable at all is to work around buggy
> > servlet containers.
> >
> > More background on all this is also available at:
> >
> > http://wiki.cocoondev.org/Wiki.jsp?page=RequestParameterEncoding
>
> I never saw the abovelinked page before.

It's been there since 13/3/2003, and its URL has been posted to this list
multiple times since then. I'd like to move (a subset of) that info into
the standard Cocoon docs, but first I'd like to see the Tomcat issue
resolved.

> But for more than a year I have
> this set is web.xml:
>
> <init-param>
>   <param-name>container-encoding</param-name>
>   <param-value>utf-8</param-value>
> </init-param>
>
> <init-param>
>   <param-name>form-encoding</param-name>
>   <param-value>utf-8</param-value>
> </init-param>
>
> In the site map we are using this HTML 4.01 serializer component:
>
> <map:serializer name="html" ....>
>   <doctype-public>-//W3C//DTD HTML 4.01 Transitional//EN</doctype-public>
>   <doctype-system>http://www.w3.org/TR/html4/loose.dtd</doctype-system>
>   <encoding>ISO-8859-1</encoding>
>   <buffer-size>1024</buffer-size>
>   <omit-xml-declaration>true</omit-xml-declaration>
> </map:serializer>
>
> With this configuration we are able to connect to a PostgreSQL database
> UTF-8 encoded.
>
> Hope this help.

Oops! That's quite a wrong configuration you have there. If you thought
you were using UTF-8 for the communication with your browser, then I'll
have to disappoint you: you're using ISO-8859-1.

Specifying UTF-8 twice in the web.xml is the same as specifying nothing,
because the two settings cancel each other out. The servlet container
decodes the request parameters as ISO-8859-1, and then Cocoon does this:

    new String(value.getBytes("UTF-8"), "UTF-8");

which is a no-op (but does burn a lot of CPU cycles; you're better off
disabling those parameters in the web.xml if you're just using
ISO-8859-1).

Note that the encoding used to connect to your database, and how your
database stores the data internally, are completely separate issues from
the encoding used between web server and browser. Whether and how the
database side needs to be configured depends on the database product.

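To make the cancelling-out concrete, here is a minimal sketch in plain
Java. It assumes the re-decoding step is exactly the expression quoted
above, with the two web.xml values plugged in. The class and method names
(RequestEncodingDemo, redecode) are made up for illustration and are not
Cocoon APIs, and the "é" test value is just an example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RequestEncodingDemo {

    // Hypothetical helper: re-encode with container-encoding, then decode
    // with form-encoding, as in the expression quoted in this mail.
    static String redecode(String value, Charset containerEncoding, Charset formEncoding) {
        return new String(value.getBytes(containerEncoding), formEncoding);
    }

    public static void main(String[] args) {
        // Assume the browser submitted "é" encoded as UTF-8 (bytes C3 A9).
        String typed = "\u00e9";
        byte[] wire = typed.getBytes(StandardCharsets.UTF_8);

        // The servlet container decodes request parameters as ISO-8859-1,
        // so the two bytes become two separate characters.
        String fromContainer = new String(wire, StandardCharsets.ISO_8859_1);

        // container-encoding=utf-8, form-encoding=utf-8: nothing changes.
        String bothUtf8 = redecode(fromContainer,
                StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // container-encoding=ISO-8859-1, form-encoding=utf-8: the original
        // bytes are recovered and then decoded the way the browser meant.
        String isoThenUtf8 = redecode(fromContainer,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println("utf-8 twice leaves value unchanged: "
                + bothUtf8.equals(fromContainer));
        System.out.println("ISO-8859-1 then utf-8 restores input: "
                + isoThenUtf8.equals(typed));
    }
}
```

Both checks print true. The round trip only undoes the container's
decoding when container-encoding matches what the container actually
used, which is why that value has to stay ISO-8859-1.
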
-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
[EMAIL PROTECTED]
