In Tomcat 4 you can configure a filter in your web.xml:
<filter>
  <filter-name>SetCharacterEncodingFilter</filter-name>
  <filter-class>filters.SetCharacterEncodingFilter</filter-class>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
</filter>
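See "webapps\examples\WEB-INF\classes\filters\SetCharacterEncodingFilter.java"
for the source. Note that the filter only runs for requests matched by a
corresponding <filter-mapping> element.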
If the head of your form page looks like this:
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
and you submit this form, IE and Netscape send the form data UTF-8 encoded but
do not set the encoding of the request to UTF-8. If the encoding of the request
is not set, the server uses the default, ISO-8859-1, to decode the request. In
Tomcat you can change the default using the SetCharacterEncodingFilter. But then
you have to use UTF-8 on all your forms; you cannot mix encodings.
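
To make the effect concrete, here is a minimal sketch using only the JDK (no
servlet API). It percent-encodes a value the way a browser would for a UTF-8
page and then decodes it once with ISO-8859-1 and once with UTF-8. The class
and method names are made up for illustration; they are not part of Tomcat.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Hypothetical demo class; not part of Tomcat or Cocoon. */
final class FormDecodingDemo {

    /** Percent-encodes a value as a browser does for a page served as UTF-8. */
    static String encodeAsUtf8Form(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Decodes a percent-encoded value with the charset the server assumes. */
    static String decodeForm(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = encodeAsUtf8Form("\u00e8");       // e with grave accent
        String wrong = decodeForm(raw, "ISO-8859-1");  // the server default
        String right = decodeForm(raw, "UTF-8");       // the encoding the filter selects
        System.out.println("raw=" + raw
                + " wrongLength=" + wrong.length()
                + " rightLength=" + right.length());
    }
}
```

Decoded with the default, the single character arrives as two characters; decoded
as UTF-8 it arrives intact.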
Volker
From: Mathias Brökelmann <mathias@mathias.d2g.com>
Date: 28.04.2002 11:07
Please reply to cocoon-dev
To: [EMAIL PROTECTED]
Cc: (Bcc: Volker Schmitt/BASF-AG/BASF)
Subject: Re: The encoding nightmare with StreamGenerator
Hi,
I think the problem is the servlet engine which parses the parameters
out of the request. StreamGenerator simply takes the parameters from the
request object.
Tomcat will use ISO-8859-1 as the character encoding if a browser such as IE
or Netscape does not send the character encoding to the server.
Bad thing: it is hard-coded in Tomcat, so you cannot configure the
default encoding (see the Tomcat sources:
org.apache.catalina.connector.RequestBase, method getReader()).
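
When the container has already decoded a parameter as ISO-8859-1, a workaround
often used in servlet code is to turn the string back into bytes and decode
those bytes as UTF-8. Here is a minimal sketch, assuming the browser really
sent UTF-8; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; illustrates the round trip, not a Tomcat API. */
final class MojibakeRepair {

    /** Simulates a server that reads UTF-8 bytes as ISO-8859-1. */
    static String misdecodeAsLatin1(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /**
     * Recovers the original text. ISO-8859-1 maps every byte value to exactly
     * one character, so the original bytes can be restored without loss.
     */
    static String redecodeAsUtf8(String misdecoded) {
        byte[] original = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String broken = misdecodeAsLatin1("\u00e8");
        String fixed = redecodeAsUtf8(broken);
        System.out.println(broken.length() + " -> " + fixed.length());
    }
}
```

This only helps if every value was sent as UTF-8; applied to text that really
was ISO-8859-1, it would damage it.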
The only solution I found is to send the POST not as
application/x-www-form-urlencoded but as multipart/form-data.
The result is that you get the content as binary, not already parsed
by the servlet engine. This should work especially well for XML streams,
because the <?xml version="1.0" encoding="UTF-8"?> declaration
identifies the encoding.
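
The difference between handing a parser raw bytes and handing it text that has
already been decoded can be sketched with the JDK's built-in XML parser. This
is only an illustration under that assumption, not Cocoon's StreamGenerator
code; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo class; uses only the JDK's JAXP parser. */
final class XmlEncodingDemo {

    private static DocumentBuilder newBuilder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    /** Byte input: the parser reads the XML declaration and decodes itself. */
    static String textFromBytes(byte[] body) {
        try {
            Document doc = newBuilder().parse(new ByteArrayInputStream(body));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Character input: the text is already decoded, right or wrong. */
    static String textFromChars(String decoded) {
        try {
            InputSource source = new InputSource(new StringReader(decoded));
            Document doc = newBuilder().parse(source);
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<page><abstract>\u00e8</abstract></page>";
        byte[] body = xml.getBytes(StandardCharsets.UTF_8);
        // What a servlet engine defaulting to ISO-8859-1 would hand over.
        String misdecoded = new String(body, StandardCharsets.ISO_8859_1);
        System.out.println(textFromBytes(body).length()
                + " vs " + textFromChars(misdecoded).length());
    }
}
```

With byte input the parser applies the declared encoding; with character input
the damage is already done before the parser sees the declaration.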
Anyway, the StreamGenerator does not seem to be able to handle
multipart/form-data as Content-Type. Why?
Hope that helps.
Mathias Broekelmann
> -----Original Message-----
> From: Robert Koberg [mailto:[EMAIL PROTECTED]]
> Sent: Sunday, 28 April 2002 00:28
> To: [EMAIL PROTECTED]
> Subject: Re: The encoding nightmare with StreamGenerator
>
> Hi Stefano.
>
> Is your xsl:output putting out utf-8 or iso?
>
> We have the same problem not using cocoon. We use JS to pre-parse for
> these kinds of things - trial and error... :(
>
> best,
> -Rob
>
>
> Stefano Mazzocchi wrote:
>
> >I have a browser that sends a POST request with:
> >
> > content-type: application/x-www-form-urlencoded
> >
> >and the hidden field "content" is populated (using client-side
> >javascript) with some xml which looks like this
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> > <page>
> > <title>Title</title>
> > <abstract>è</abstract>
> > ...
> > </page>
> >
> >the weird "è" text is the UTF-8 encoded value for [è] (depending on
> >your mail client you might not see any of the above as I wrote it,
> >but that's exactly part of the encoding nightmare that UTF was
> >designed to fix... but there is still a long way to go)
> >
> >Now, I use StreamGenerator to get this text, have it parsed and
> >feed my pipeline. So far so good.
> >
> >The problem is that stupid StreamGenerator doesn't recognize the
> >encoding (because the content-type doesn't have the 'charset=' part
> >defined (and IE can't be tweaked to emit that, AFAIK)) so it spits the
> >characters out "as they are" (as if they were ASCII encoded) (I used
> >the LogTransformer to witness this and the same weird 'è' appears in
> >the logs with no encoding translation taking place).
> >
> >It seems that StreamGenerator (or the parser instance it instantiates)
> >fails to see that 'è' is not two 8-bit chars but one 16-bit char.
> >
> >I'm positive the bug resides in StreamGenerator: in fact, if I tweak the
> >javascript to fill the form content with
> >
> > <?xml version="1.0" encoding="BLAH"?>
> >
> >the parser doesn't even trigger an error.
> >
> >I'm going to investigate how to patch this since I need it badly! But if
> >you have any suggestions I'm all ears.
> >
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, email: [EMAIL PROTECTED]