On Fri, 2004-05-28 at 22:18, Jasper Michalczik wrote:
> Dear Reinhard, dear Cocoon-users,
>
> I was asked to give a short explanation on how to use Cocoon for
> non-roman languages - especially Arabic - which should be of use for
> Chinese as well.
>
> I'm not too firm in using Cocoon, so please feel free to correct or
> extend this.
>
>
> All files have to be saved as utf-8, so make sure to add/change the
> first line of your xml/xsl-files:
>
> <?xml version="1.0" encoding="UTF-8"?>

This isn't a requirement: it can be any encoding you like, as long as it
supports the characters you need. It can be a different encoding than the
one used to send the page to the browser. UTF-8 is a good choice, though.

> In sitemap.xmap I added the following to each serializer:
>
> <map:serializer logger=...>
> <encoding>UTF-8</encoding>
> </map:serializer>
>
> This adds the following META-Tag to the serialized document:
>
> <META http-equiv="Content-Type" content="text/html;
> charset=UTF-8">

Yep, but it only does so if your page already has an html/head element in it.

>
> Then I set the following parameters in web.xml...
>
> <init-param>
> <param-name>container-encoding</param-name>
> <param-value>ISO-8859-1</param-value>
> </init-param>
> <init-param>
> <param-name>form-encoding</param-name>
> <param-value>UTF-8</param-value>
> </init-param>
>
> ... to make sure the forms are processed correctly.
>
> On the client side at least Windows 2000 (I don't know about Linux or
> Mac) must be used with the keyboard settings set up to allow
> Arabic/Chinese typing. If you only need to display non-roman characters,
> this also works with any system and a browser that supports
> Unicode-display. IE5+ for example downloads the necessary fonts
> automatically when needed.
>
> I remember having some troubles using Tomcat 4.1.29, but 4.1.18 works
> fine.

This is because of the following issue:
http://issues.apache.org/bugzilla/show_bug.cgi?id=26997

> I don't have any experiences with any other version or
> servlet-container.
>
>
> I only can't explain why the container-encoding in web.xml has to be set
> to ISO-8859-1. If anybody knows about this, please add it to this text.
> Any other setting I tried to use didn't work out.

It has to be ISO-8859-1, always. This is because the servlet specification
requires that request parameters are by default decoded as ISO-8859-1
(regardless of the default platform encoding).
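
To illustrate why the ISO-8859-1 / UTF-8 pair works, here is a minimal sketch of the general re-decoding technique. It is not Cocoon's actual code: the class name ParameterRecoding and the method recode are made up for this example, and the container's decoding step is only simulated. The point is that ISO-8859-1 maps every byte to exactly one character, so the original bytes sent by the browser can be recovered losslessly and decoded again with the real form encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class ParameterRecoding {

    // Turn the container-decoded string back into the raw bytes,
    // then decode those bytes with the encoding the form really used.
    public static String recode(String containerDecoded,
                                Charset containerEncoding,
                                Charset formEncoding) {
        byte[] raw = containerDecoded.getBytes(containerEncoding);
        return new String(raw, formEncoding);
    }

    public static void main(String[] args) {
        // An Arabic word as typed into the form (written with escapes
        // so this source file is itself encoding-independent).
        String original = "\u0645\u0631\u062D\u0628\u0627";

        // Simulated browser: the form data goes over the wire as UTF-8 bytes.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // Simulated servlet container: decodes the bytes as ISO-8859-1.
        String mangled = new String(wire, StandardCharsets.ISO_8859_1);

        // Undo the container's decoding, then apply the form encoding.
        String recovered = recode(mangled,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println(recovered.equals(original)); // true
    }
}
```

If the first encoding passed to recode does not match what the container really used, the raw bytes cannot be recovered and the result is garbage, which would explain why other container-encoding values did not work.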
The only reason I can imagine the container-encoding is configurable at all
is to work around buggy servlet containers.

More background on all this is also available at:
http://wiki.cocoondev.org/Wiki.jsp?page=RequestParameterEncoding

>
>
> I hope I could make a small contribution to the growing
> cocoon-community...

Sure!

>
>
> Jasper Michalczik
>

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
