I'm not sure, but it seems to me that the encoding specified in the source
XML file doesn't make any difference. I came across some comments in the
source code saying that the setEncoding() method hasn't been implemented
yet.

You can specify an encoding when serializing, although I think the default
is UTF-8.
If you want to try it, use for example
    <map:serializer name="html" mime-type="text/html"
                    src="org.apache.cocoon.serialization.HTMLSerializer">
         <encoding>ISO-8859-1</encoding>
    </map:serializer>
in your sitemap.
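
For what it's worth, the '?' characters are what Java itself produces when a
character has no mapping in the target encoding. Here is a minimal sketch of
that, assuming a reasonably recent JDK (it uses
java.nio.charset.StandardCharsets). The class and method names are made up
for illustration; this is not Cocoon code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Encode the text with the given charset and decode it again,
    // so we can see which characters survive the trip.
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String epsilon = "\u03b5"; // Greek small letter epsilon
        String eAcute = "\u00e9";  // Latin small letter e with acute

        // ISO-8859-1 has no epsilon, so getBytes() substitutes '?'.
        System.out.println(roundTrip(epsilon, StandardCharsets.ISO_8859_1).equals("?"));
        // The French character is representable in ISO-8859-1.
        System.out.println(roundTrip(eAcute, StandardCharsets.ISO_8859_1).equals(eAcute));
        // UTF-8 can represent both.
        System.out.println(roundTrip(epsilon, StandardCharsets.UTF_8).equals(epsilon));
    }
}
```

So ISO-8859-1 was enough for our French text, but for Greek or Hebrew you
would presumably want UTF-8 (or another encoding that covers those
characters) in the serializer.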

As I said, we had almost the same problem, but only when data was written
to disk (e.g. log files). So you can try adding the encoding to the
serializer.
You can also try setting the file.encoding system property. We're using
Cocoon with Tomcat, so I put -Dfile.encoding=ISO8859_1 in the TOMCAT_OPTS
variable that is used when Tomcat is started.
A list of encodings:
http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html
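
To check what the JVM actually picked up, and to see that a writer with an
explicit encoding does not depend on file.encoding at all, something like the
following sketch should work. The names are hypothetical again, and this is
plain JDK code, not how Cocoon or Tomcat write their logs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterEncodingDemo {
    // Write the text through a Writer that uses an explicit charset
    // and return the raw bytes that would end up on disk.
    static byte[] write(String text, Charset cs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, cs)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // What the JVM uses when no encoding is given explicitly.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        String french = "\u00e9t\u00e9"; // e-acute, t, e-acute
        // One byte per character in ISO-8859-1.
        System.out.println(write(french, StandardCharsets.ISO_8859_1).length);
        // Two bytes for each accented character in UTF-8.
        System.out.println(write(french, StandardCharsets.UTF_8).length);
    }
}
```

Code that opens a writer without naming a charset falls back to the default,
which is why setting file.encoding made a difference for us.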

Hope this helps, this is what solved our problems...


Jan Uyttenhove
Software Engineer

-----Original Message-----
From: Wes Morgan [mailto:[EMAIL PROTECTED]]
Sent: Thursday, 2 August 2001 15:42
To: [EMAIL PROTECTED]
Subject: Re: [C2b2] Unicode output


I don't know if this will help me or not. I am running RedHat Linux 6.2
and I do not currently use XSP anywhere. I also do not specify an
encoding when serializing. The XML source file has encoding="UTF-8" in
its <?xml ...?> declaration. Do I also have to say something to that
effect in my sitemap in the <map:serialize .../> element? Thank you for
your response.

Wes Morgan

Jan Uyttenhove wrote:

>We had the same problem with French characters.
>I had to set the file.encoding system property (on Solaris), because the
>generated Java code (from XSP) and our logging both contained '?' instead
>of the French characters. So the problem actually occurred when writing to
>disk...
>
>Don't know if this helps you out.
>What's your OS? Do you use XSPs?
>Do you specify an encoding when serializing?
>
>Jan
>
>
>Jan Uyttenhove
>Software Engineer
>
>
>
>
>-----Original Message-----
>From: Wes Morgan [mailto:[EMAIL PROTECTED]]
>Sent: Monday, 30 July 2001 22:09
>To: [EMAIL PROTECTED]
>Subject: [C2b2] Unicode output
>
>
>I am working with some XML documents that have Greek and Hebrew Unicode
>characters in them. I am using Unicode fonts on the client side, so I need
>these characters to come through unmodified, but Cocoon changes them to '?'
>if it doesn't recognize them and to a character entity (e.g. &epsilon;)
>when it does. This is bad; I need Cocoon to leave these characters alone.
>Any suggestions? Thanks.
>
>Wes Morgan
>www.ccel.org
>
>
>---------------------------------------------------------------------
>Please check that your question has not already been answered in the
>FAQ before posting. <http://xml.apache.org/cocoon/faqs.html>
>
>To unsubscribe, e-mail: <[EMAIL PROTECTED]>
>For additional commands, e-mail: <[EMAIL PROTECTED]>
>


