On Wed, 11 Jun 2003 19:19, Mark F wrote:
> Is someone collecting the posts like this one? Information like this needs
> to be indexed.
>
> -Mark
It will be remembered now. I did see it before and earmarked it, but now that
it has FAQ in the subject line it will get recorded. :)

If you post anything worthy of including in a FAQ or howto, please put 'FAQ'
somewhere in the subject line; that will ensure it gets picked up.

Regards,
--
Jason Bainbridge
http://jblinux.org

> ----- Original Message -----
> From: "Roberts, Eric" <[EMAIL PROTECTED]>
> To: "Tomcat Users List" <[EMAIL PROTECTED]>
> Sent: Wednesday, June 11, 2003 5:47 AM
> Subject: RE: How to UTF-8 your site.
>
>
> I would just like to say thanks.
>
> I have a magazine site which I recently upgraded from 3.2 to 4.1.24, and
> all quotation marks in the text were being replaced with the dreaded "?".
> Manually editing over 100 pages was not a prospect I was looking forward to.
>
> Following your advice, everything is now 100%.
>
> Thanks
>
> -----Original Message-----
> From: Andoni [mailto:[EMAIL PROTECTED]
> Sent: 10 June 2003 13:27
> To: Tomcat Users List
> Subject: How to UTF-8 your site.
>
>
> Hello,
>
> I have recently completed the torturous process of translating my web-site
> into 16 European languages. Having had lots of advice from this list and
> other sources, I have come down to a few conclusions about what a Java /
> Tomcat web-site needs in order to fully support UTF-8.
>
> These are:
>
> 1.
> JSP pages must include the header:
>
> <%@ page
>     contentType="text/html; charset=UTF-8"
> %>
>
> 2.
> In the catalina.bat (Windows), catalina.sh (Unix) or
> apache$jakarta_config.com (OpenVMS) file, there must be a switch added to
> the call to java.exe. The switch is:
>
> -Dfile.encoding=UTF-8
>
> I cannot find documentation anywhere for this system property or what
> it actually does, but it is essential.
>
> 3.
> For translation of inputs coming back from the browser there must be a
> method that translates from the browser's ISO-8859-1 to UTF-8.
> It seems to me that ISO-8859-1 is used in all regions, as I have had people
> in countries such as Greece & Bulgaria test this and they always send input
> back in ISO-8859-1 encoding.
> The method, which you will use constantly, should go something like this:
>
> // Needs: import java.io.UnsupportedEncodingException;
>
> /**
>  * Convert an ISO-8859-1 format string (which is the default sent by IE)
>  * to the UTF-8 format that the database is in.
>  */
> public String toUTF8(String isoString)
> {
>     String utf8String = null;
>     if (null != isoString && !isoString.equals(""))
>     {
>         try
>         {
>             byte[] stringBytesISO = isoString.getBytes("ISO-8859-1");
>             utf8String = new String(stringBytesISO, "UTF-8");
>         }
>         catch(UnsupportedEncodingException e)
>         {
>             // As we can't translate just send back the best guess.
>             System.out.println("UnsupportedEncodingException is: " +
>                 e.getMessage());
>             utf8String = isoString;
>         }
>     }
>     else
>     {
>         utf8String = isoString;
>     }
>     return utf8String;
> }
>
>
> I have found that these three steps are all that is necessary to make your
> site accept any language that UTF-8 can work with. I extend my thanks to
> those of you on the Tomcat users list who helped me find these little gems.
>
> Kind regards,
>
> Andoni.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
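
For anyone picking this up from the FAQ later, here is a minimal sketch of
the re-decoding step from point 3, written against
java.nio.charset.StandardCharsets (Java 7 and later) so that there is no
checked UnsupportedEncodingException to handle. The class name Utf8Repair
and the sample string are made up for illustration. It assumes the garbling
arises because UTF-8 bytes were decoded as ISO-8859-1 somewhere along the
way, which is what the getBytes / new String pair in the quoted method
undoes; it may not match every setup.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Tomcat or the original post.
final class Utf8Repair {
    private Utf8Repair() {}

    /**
     * Re-decode a string whose UTF-8 bytes were wrongly decoded as ISO-8859-1.
     * Null and empty input are returned unchanged.
     */
    static String toUtf8(String isoString) {
        if (isoString == null || isoString.isEmpty()) {
            return isoString;
        }
        // ISO-8859-1 maps chars 0x00-0xFF one-to-one onto bytes,
        // so this recovers the raw bytes that were originally received.
        byte[] raw = isoString.getBytes(StandardCharsets.ISO_8859_1);
        // Decode those same bytes as the UTF-8 they really were.
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Greek" written in Greek, as Unicode escapes so the source file
        // encoding does not matter.
        String original = "\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac";
        // Simulate the assumed problem: UTF-8 bytes decoded as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        System.out.println(original.equals(toUtf8(misdecoded)));
    }
}
```

Running main should print true. Input that was already decoded correctly
should not be passed through this method, since re-decoding it would damage
any non-ASCII characters.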