On Mon, 25 Nov 2002, Bodycombe, Andrew wrote:

> Date: Mon, 25 Nov 2002 15:29:29 +0000
> From: "Bodycombe, Andrew" <[EMAIL PROTECTED]>
> Reply-To: Tomcat Users List <[EMAIL PROTECTED]>
> To: "'[EMAIL PROTECTED]'" <[EMAIL PROTECTED]>
> Subject: RE: Discrepancies between servlets and JSP on tomcat in handling
>     UTF-8 ?
>
> Interesting. I have encountered a similar problem.
>
> I have a servlet that connects to an XML application. The response from the
> application is read using a SAX reader, and I encountered an error if the
> response contained any non-ASCII characters (� and � in particular, as I am
> currently working in Germany).
>
> I did a little investigation and found that the content type was text/xml;
> charset=ISO-8859-1 and the XML declaration was
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> Now the XML I received in my servlet was ISO-8859-1 and not UTF-8, so I have
> contacted the application developers to say "Please fix your application
> because the XML I receive is not encoded as UTF-8, it is ISO-8859-1."
>
> I have a workaround, where I read the input using an ISO-8859-1
> InputStreamReader and get the SAXReader to use this as its input. This is
> working fine as a temporary measure.
>
> The original message in this thread suggests to me that this could actually
> be a Tomcat problem and not necessarily a problem with the application I
> connect to. There is clearly a discrepancy between the "encoding" value in
> the XML declaration and the "charset" in the content type, and the SAX
> reader is using the encoding attribute to decode the text.
>
> Is Tomcat doing something with the HTTP text, possibly converting it from
> UTF-8 into ISO-8859-1?
>

Yes.

The Servlet and JSP specs require that the character encoding (if you are
writing a character-based response) default to ISO-8859-1 unless you
explicitly tell the container otherwise.
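
To make that default concrete, here is a small sketch using only the JDK (no
servlet API). The class name and the sample text are mine, purely for
illustration. It shows the bytes a writer produces for the same string under
ISO-8859-1 and under UTF-8, and what happens when a client decodes the
ISO-8859-1 bytes as UTF-8 because a declaration told it to.

```java
import java.nio.charset.StandardCharsets;

public class ResponseEncodingDemo {

    // Hypothetical sample text: \u00fc is u-umlaut, \u00df is sharp s.
    static final String SAMPLE = "Gr\u00fc\u00dfe";

    // Bytes produced when the text is written with the ISO-8859-1 default.
    static byte[] asLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Bytes produced when the text is written as UTF-8.
    static byte[] asUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // What a client ends up with if it decodes ISO-8859-1 bytes as UTF-8.
    static String misreadAsUtf8(byte[] latin1Bytes) {
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(asLatin1(SAMPLE).length); // 5: one byte per character
        System.out.println(asUtf8(SAMPLE).length);   // 7: the two non-ASCII characters take two bytes each
        // The ISO-8859-1 bytes are not valid UTF-8, so the text does not survive.
        System.out.println(misreadAsUtf8(asLatin1(SAMPLE)).equals(SAMPLE)); // false
    }
}
```

A plain String decode silently substitutes replacement characters for the
invalid bytes; a strict decoder, such as the one inside an XML parser, is
more likely to report an error instead.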

To set the character encoding in a servlet, add the charset parameter to
the content type, and do so before you call response.getWriter():

  response.setContentType("text/html;charset=UTF-8");

In a JSP page, you set the character encoding of the response in the
<%@ page %> directive:

  <%@ page contentType="text/html;charset=UTF-8" %>

In JSP 2.0, you'll have the ability to declare (in a global configuration
file) that "pages that match this URL pattern have this encoding".  For
JSP 1.2 (i.e. Tomcat 4.x) you do not have that option, and must declare it
inside the page itself.
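
Until the sending application is fixed, the client-side workaround described
in the quoted message can be sketched roughly as below. This is not Andrew's
actual code: it uses the JDK's JAXP SAX parser rather than the SAXReader he
mentions, and the class name, method names and payload are made up for
illustration. The point is that a SAX parser handed a character stream
decodes nothing itself and disregards the encoding declaration, whereas one
handed a byte stream trusts the declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class Latin1XmlWorkaround {

    // Hypothetical payload: declared as UTF-8 but actually encoded as ISO-8859-1.
    static byte[] mislabelledXml() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><msg>Gr\u00fc\u00dfe</msg>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Runs a SAX parse and returns the concatenated character data.
    static String parse(InputSource source) {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
        return text.toString();
    }

    // Byte stream: the parser decodes according to the encoding declaration.
    static String parseTrustingDeclaration(byte[] xml) {
        return parse(new InputSource(new ByteArrayInputStream(xml)));
    }

    // Character stream: the bytes are decoded here as ISO-8859-1, and the
    // parser disregards the encoding declaration.
    static String parseWithLatin1Reader(byte[] xml) {
        InputStreamReader reader =
                new InputStreamReader(new ByteArrayInputStream(xml), StandardCharsets.ISO_8859_1);
        return parse(new InputSource(reader));
    }

    public static void main(String[] args) {
        byte[] xml = mislabelledXml();
        System.out.println(parseWithLatin1Reader(xml));
        try {
            System.out.println(parseTrustingDeclaration(xml));
        } catch (IllegalStateException e) {
            System.out.println("Declared encoding did not match the bytes: " + e.getCause());
        }
    }
}
```

The drawback is that the reader's charset is hard-coded, so this breaks again
as soon as the sender starts emitting real UTF-8. That is why it is only
suitable as a temporary measure.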

> I confess I have not tried this servlet in other servlet containers, only
> Tomcat version 4.1.12 running on Windows.
>
> Andy
>

Craig

