Hello from a new XSP user.
 
I am looking for advice on how to pass dynamic UTF-8 content correctly through XSP. I am using Cocoon 1.8.2 with Tomcat 3.2.3 and Apache 1.3.14, running on SuSE Linux 7.1.
 
I have written an <xsp:page> search page that retrieves data from the dbXML database (soon to be Apache Xindice) and integrates it with static XML content. dbXML has a Java interface, and the results of database queries are delivered as DOM nodes, which I insert like this:

<xsp:expr>XSPUtil.cloneNode(fetchedDoc, document)</xsp:expr>

where fetchedDoc is the node that resulted from the database query.

Static content is decoded correctly, but the dynamically fetched documents are not. Everything in my page and XSL stylesheet is UTF-8, as indicated in the XML declarations:

<?xml version="1.0" encoding="UTF-8"?>
 
Everything in the dbXML data store is UTF-8 as well. XSP and the HTML formatter have been set up to deliver UTF-8 in cocoon.properties:
 
processor.xsp.encoding = UTF-8
...
formatter.text/html.encoding       = UTF-8
 
The problem arises with multi-byte characters being delivered in the DOM nodes from dbXML. They are fetched correctly with the correct multi-byte contents, but somewhere in the processing pipeline each byte is being treated as a single character and rendered as such in the XSL transformation.
 
If I save the output as ISO-8859-1 and then open it in a UTF-8 editor, the result is correct. So the static and the dynamic content somehow get out of sync regarding their encoding. The XSLT does not seem to be the cause, however, since the text/plain formatter gives the same result.
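
To make the symptom concrete, here is a minimal Java sketch of what I believe is happening. It is not taken from Cocoon or dbXML: the class and method names are made up for illustration, and it only assumes that some step decodes UTF-8 bytes as if they were ISO-8859-1. It uses java.nio.charset.StandardCharsets from current JDKs; on an older JDK the charset names would be passed as strings instead.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Cocoon or dbXML.
class EncodingMismatchDemo {

    // Simulates the suspected fault: UTF-8 bytes are decoded as ISO-8859-1,
    // so every byte of a multi-byte sequence becomes a character of its own.
    static String misread(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the fault: write each character back as a single byte
    // (ISO-8859-1), then decode the resulting bytes as UTF-8.
    static String repair(String garbled) {
        byte[] bytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "K\u00f8benhavn";   // "København", 9 characters
        String garbled = misread(original);   // the "ø" turns into two characters

        System.out.println(original.length());                 // 9
        System.out.println(garbled.length());                  // 10
        System.out.println(repair(garbled).equals(original));  // true
    }
}
```

The repair step corresponds to saving the output as ISO-8859-1 and reopening it as UTF-8, which is why that workaround restores the text.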
 
How can I fix this?

Many thanks for your attention.
Anders
 
Anders Conrad                  Det Danske Sprog- og Litteraturselskab
IT-redaktør, cand.mag.       Christians Brygge 1
E-mail: [EMAIL PROTECTED]             1219 København K
                                        Tlf. 33 13 06 60
