Is someone collecting posts like this one?  Information like this needs
to be indexed.

-Mark

----- Original Message -----
From: "Roberts, Eric" <[EMAIL PROTECTED]>
To: "Tomcat Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, June 11, 2003 5:47 AM
Subject: RE: How to UTF-8 your site.


I would just like to say thanks.

I have a magazine site which I recently upgraded from 3.2 to 4.1.24, and all
quotation marks in the text were being replaced with the dreaded "?". Editing
over 100 pages manually was not a prospect I was looking forward to.

Following your advice, everything is now 100%.

Thanks

-----Original Message-----
From: Andoni [mailto:[EMAIL PROTECTED]
Sent: 10 June 2003 13:27
To: Tomcat Users List
Subject: How to UTF-8 your site.


Hello,

I have recently completed the torturous process of translating my web-site
into 16 European languages.  Having had lots of advice from this list and
other sources I have come down to a few conclusions about what a Java /
Tomcat web-site needs in order to fully support UTF-8.

These are:

1.
JSP pages must include the header:

<%@ page
 contentType="text/html; charset=UTF-8"
%>
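
As a rough illustration of why the declared charset matters, here is a
minimal sketch in plain Java (no servlet API). The class and method names
are made up for this example. It decodes the same bytes twice: once with
the charset they were written in, and once with ISO-8859-1, which is
roughly what happens when a UTF-8 page is read without the header above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetHeaderDemo {

    // Decode raw bytes using the charset the reader believes is in use.
    static String decodeAs(byte[] bytes, Charset assumed) {
        return new String(bytes, assumed);
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute, written as an escape to avoid
        // depending on the source file's own encoding.
        String original = "caf\u00e9";
        byte[] onTheWire = original.getBytes(StandardCharsets.UTF_8);

        // Reader told the truth: the text comes back unchanged.
        String good = decodeAs(onTheWire, StandardCharsets.UTF_8);

        // Reader assumes ISO-8859-1: the two-byte e-acute turns into
        // two separate characters (U+00C3 and U+00A9).
        String bad = decodeAs(onTheWire, StandardCharsets.ISO_8859_1);

        System.out.println(good.equals(original));
        System.out.println(bad.length() - original.length());
    }
}
```

The first line printed is true and the second is 1, because the single
accented character has become two.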

2.
In catalina.bat (Windows), catalina.sh (Unix) or
apache$jakarta_config.com (OpenVMS), there must be a switch added to
the call to java.  The switch is:

-Dfile.encoding=UTF-8

I cannot find documentation anywhere for this system property or for what
it actually does, but it is essential.
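
For anyone wanting to check whether the switch took effect, the sketch
below prints what the running JVM reports. The -D option defines a Java
system property, and file.encoding has traditionally fed the default
charset used by calls such as String.getBytes() with no argument. The
class name is invented for this example, and behaviour varies by JVM
version: recent JDKs (18 and later) default to UTF-8 regardless, so treat
this as a diagnostic and not as a definition of what the property does.

```java
import java.nio.charset.Charset;

public class DefaultEncodingReport {

    // Build a one-line summary of the JVM's encoding settings.
    static String report() {
        return "file.encoding=" + System.getProperty("file.encoding")
             + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Running it once as "java DefaultEncodingReport" and once as
"java -Dfile.encoding=UTF-8 DefaultEncodingReport" should show whether the
switch changes anything on a given installation.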

3.
For translation of inputs coming back from the browser there must be a
method that translates from the browser's ISO-8859-1 to UTF-8.  It seems to
me that -1 is used in all regions as I have had people in countries such as
Greece & Bulgaria test this and they always send input back in -1 encoding.
The method which you will use constantly should go something like this:

 import java.io.UnsupportedEncodingException;

 /**
  * Convert an ISO-8859-1 format string (which is the default sent by IE)
  * to the UTF-8 format that the database is in.
  */
 public String toUTF8(String isoString)
 {
  String utf8String = null;
  if (null != isoString && !isoString.equals(""))
  {
   try
   {
    byte[] stringBytesISO = isoString.getBytes("ISO-8859-1");
    utf8String = new String(stringBytesISO, "UTF-8");
   }
   catch(UnsupportedEncodingException e)
   {
    // As we can't translate just send back the best guess.
    System.out.println("UnsupportedEncodingException is: "
      + e.getMessage());
    utf8String = isoString;
   }
  }
  else
  {
   utf8String = isoString;
  }
  return utf8String;
 }
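
To see the conversion working end to end, here is a small self-contained
sketch. It assumes one likely explanation of the symptom: the browser
actually sends UTF-8 bytes (because the page was served as UTF-8) and the
container decodes them as ISO-8859-1 by default. The class name and the
misdecode helper are hypothetical and only simulate that step; toUTF8 is
the same conversion as above, written with StandardCharsets (Java 7 and
later) so no checked exception is needed.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {

    // Same idea as the toUTF8 method above: take the characters back to
    // their original bytes via ISO-8859-1, then decode those as UTF-8.
    static String toUTF8(String isoString) {
        if (isoString == null || isoString.isEmpty()) {
            return isoString;
        }
        byte[] raw = isoString.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: simulates a container that decodes UTF-8
    // form data as ISO-8859-1.
    static String misdecode(String typed) {
        byte[] sent = typed.getBytes(StandardCharsets.UTF_8);
        return new String(sent, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The Greek word for Greece, written with escapes.
        String typed = "\u0395\u03bb\u03bb\u03ac\u03b4\u03b1";
        String received = misdecode(typed);   // 12 garbled characters
        String repaired = toUTF8(received);   // back to 6 Greek letters
        System.out.println(repaired.equals(typed)); // prints true
    }
}
```

The repair is lossless because ISO-8859-1 maps every byte value to exactly
one character. It only applies to strings that were mis-decoded in this
way; running it on text that is already correct will damage any
non-ASCII characters. The Servlet 2.3 API also offers
request.setCharacterEncoding("UTF-8"), called before any parameter is
read, which may avoid the need for the conversion altogether.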


I have found that these three steps are all that is necessary to make your
site accept any language that UTF-8 can work with.  I extend my thanks to
those of you on the Tomcat users list who helped me find these little gems.

Kind regards,

Andoni.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


