Hi Bruno: Thanks for the answer.
I don't have time to test it right now. I know this issue comes up very
frequently now that people are realizing the right encoding is UTF-8.

Here is a link from the Tomcat FAQ:
http://jakarta.apache.org/tomcat/faq/misc.html#utf8

Best Regards,

Antonio Gallardo

Bruno Dumon wrote:
> On Sat, 2004-05-29 at 12:26, Antonio Gallardo wrote:
>> Bruno Dumon wrote:
>> >> The only thing I can't explain is why the container-encoding in
>> >> web.xml has to be set to ISO-8859-1. If anybody knows about this,
>> >> please add it to this text. Any other setting I tried to use
>> >> didn't work out.
>> >
>> > It has to be ISO-8859-1, always. This is because the servlet
>> > specification requires that request parameters are by default
>> > decoded as ISO-8859-1 (regardless of the default platform
>> > encoding). The only reason I can imagine this is configurable at
>> > all is to work around buggy servlet containers.
>> >
>> > More background on all this is also available at:
>> >
>> > http://wiki.cocoondev.org/Wiki.jsp?page=RequestParameterEncoding
>>
>> I never saw the page linked above before.
>
> It's been there since 13/3/2003 and its URL has been dropped on this
> list multiple times since then.
>
> I'd like to move (a subset of) that info into the standard Cocoon docs,
> but first I'd like to see the Tomcat issue resolved.
>
>> But for more than a year I have had this set in web.xml:
>>
>> <init-param>
>>   <param-name>container-encoding</param-name>
>>   <param-value>utf-8</param-value>
>> </init-param>
>>
>> <init-param>
>>   <param-name>form-encoding</param-name>
>>   <param-value>utf-8</param-value>
>> </init-param>
>>
>> In the sitemap we are using this HTML 4.01 serializer component:
>>
>> <map:serializer name="html" ....>
>>   <doctype-public>-//W3C//DTD HTML 4.01 Transitional//EN</doctype-public>
>>   <doctype-system>http://www.w3.org/TR/html4/loose.dtd</doctype-system>
>>   <encoding>ISO-8859-1</encoding>
>>   <buffer-size>1024</buffer-size>
>>   <omit-xml-declaration>true</omit-xml-declaration>
>> </map:serializer>
>>
>> With this configuration we are able to connect to a UTF-8 encoded
>> PostgreSQL database.
>>
>> Hope this helps.
>
> Oops! That's quite a wrong configuration you have there. If you thought
> you were using UTF-8 for the communication with your browser, then I'll
> have to disappoint you. You're using ISO-8859-1. Specifying UTF-8 twice
> in the web.xml is the same as specifying nothing, because the two
> settings cancel each other out. The servlet container decodes the
> request parameters as ISO-8859-1, and then Cocoon does this:
>
> new String(value.getBytes("UTF-8"), "UTF-8");
>
> which is an operation with no effect (but it does burn a lot of CPU
> cycles; you're better off disabling those parameters in the web.xml if
> you're just using ISO-8859-1).
>
> Note that the encoding used to connect to your database (and how your
> database stores the data internally) are completely separate issues
> from what encoding is used to communicate between webserver and browser
> (if and how this needs to be configured depends on the database
> product).
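
The point about the two UTF-8 settings cancelling out can be reproduced
outside Cocoon with a few lines of plain Java. This is only a sketch
under stated assumptions: the class and method names (EncodingDemo,
containerDecode, recode) are made up for illustration and are not
Cocoon's actual code, and it assumes the browser submitted the form as
UTF-8 bytes. It mimics the container decoding the raw bytes as
ISO-8859-1 and then applies the recoding step quoted above with two
different pairs of settings.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Mimics the servlet container default: raw request bytes are
    // decoded as ISO-8859-1, whatever the browser actually sent.
    public static String containerDecode(byte[] rawBytes) {
        return new String(rawBytes, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical stand-in for the recoding step quoted above:
    // turn the string back into bytes using containerEncoding, then
    // decode those bytes again using formEncoding.
    public static String recode(String value, String containerEncoding,
                                String formEncoding) {
        byte[] bytes = value.getBytes(Charset.forName(containerEncoding));
        return new String(bytes, Charset.forName(formEncoding));
    }

    public static void main(String[] args) {
        String typed = "\u00e9"; // e with acute accent, as typed by the user
        // Assumption: the browser submits the form as UTF-8 (bytes C3 A9).
        byte[] onTheWire = typed.getBytes(StandardCharsets.UTF_8);
        String fromContainer = containerDecode(onTheWire);

        // Both parameters set to UTF-8: the value comes back unchanged,
        // so the container's wrong decoding is never repaired.
        String bothUtf8 = recode(fromContainer, "UTF-8", "UTF-8");
        System.out.println(bothUtf8.equals(fromContainer));

        // container-encoding=ISO-8859-1, form-encoding=UTF-8: the
        // original bytes are recovered and decoded correctly.
        String repaired = recode(fromContainer, "ISO-8859-1", "UTF-8");
        System.out.println(repaired.equals(typed));
    }
}
```

Under that assumption, only the ISO-8859-1 / UTF-8 pair gives back what
the user typed. If the browser submits ISO-8859-1 instead (which is what
a page serialized as ISO-8859-1 will normally cause), the container's
decoding is already correct and the UTF-8 / UTF-8 round trip just
returns the same string, which matches the description above.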
>
> --
> Bruno Dumon                             http://outerthought.org/
> Outerthought - Open Source, Java & XML Competence Support Center
> [EMAIL PROTECTED]                          [EMAIL PROTECTED]
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
