Hello from a new XSP user.
I would like advice and suggestions on how to get dynamic
content in UTF-8 encoding through XSP correctly. I am using Cocoon 1.8.2 with
Tomcat 3.2.3 and Apache 1.3.14, running on SuSE Linux 7.1.
I have written an <xsp:page> search
page that retrieves data from the dbXML database (soon to be Apache
Xindice) and integrates it with static XML content. dbXML has a Java interface,
and the results of database queries are delivered and inserted as DOM
nodes:
<xsp:expr>XSPUtil.cloneNode(fetchedDoc, document)</xsp:expr>
where fetchedDoc is the node that resulted from the database query.
Static content is decoded correctly, but the dynamically fetched documents
are not. Everything in my page and XSL stylesheet is UTF-8, as indicated in
the XML declarations:
<?xml version="1.0" encoding="UTF-8"?>
Everything in the dbXML data store is UTF-8 as well. XSP and the HTML
formatter have been set up to deliver UTF-8 in cocoon.properties:
processor.xsp.encoding = UTF-8
...
formatter.text/html.encoding = UTF-8
The problem arises with multi-byte characters being
delivered in the DOM nodes from dbXML. They are fetched correctly with the
correct multi-byte contents, but somewhere in the processing pipeline each byte
is being treated as a single character and rendered as such in the XSL
transformation.
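To illustrate the symptom outside Cocoon, here is a minimal Java sketch of what I believe is happening. The class and method names are hypothetical and not part of Cocoon or dbXML, and it assumes Java 7 or later for StandardCharsets. It takes the UTF-8 bytes of a string and decodes them as if every byte were one ISO-8859-1 character:

```java
import java.nio.charset.StandardCharsets;

public class MisdecodeDemo {
    // Hypothetical helper: encode the text as UTF-8, then decode those
    // bytes as ISO-8859-1, so that every byte becomes one character.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "København" written with a Unicode escape, so the source file
        // encoding does not matter when compiling.
        String original = "K\u00f8benhavn";
        String garbled = misdecode(original);

        // The two-byte UTF-8 sequence for U+00F8 turns into two characters.
        System.out.println(original.length()); // 9
        System.out.println(garbled.length());  // 10
    }
}
```

The garbled string has one character more than the original because the single character U+00F8 is two bytes in UTF-8, which matches the one-character-per-byte rendering I see in the output.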
If you save the output as ISO-8859-1 and then read
it into a UTF-8 editor, the result is correct. So somehow the interpretation
of dynamic and static content gets out of sync with respect to the encoding.
The problem does not seem to lie in the XSLT, however, since the text/plain
formatter gives the same result.
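The save-as-ISO-8859-1 observation can be reproduced in a few lines of Java. This is only a diagnostic sketch with hypothetical names, assuming Java 7 or later, and not a proposed fix for the pipeline. It writes each character back out as a single ISO-8859-1 byte and then reads the bytes as UTF-8:

```java
import java.nio.charset.StandardCharsets;

public class EncodingRepair {
    // Hypothetical helper: reverse a "UTF-8 bytes read as ISO-8859-1" mix-up.
    // It only works if every character in the input is in the range
    // U+0000..U+00FF; anything else is replaced by '?' during encoding.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "K" + U+00C3 + U+00B8 + "benhavn": the two bytes of UTF-8 "ø"
        // shown as two separate Latin-1 characters.
        String garbled = "K\u00c3\u00b8benhavn";
        String repaired = repair(garbled);

        System.out.println(repaired.equals("K\u00f8benhavn")); // true
    }
}
```

That the round trip recovers the text suggests the bytes themselves are intact and that only one decoding step in the pipeline assumes the wrong character set.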
How can I fix this?
Many thanks for your attention.
Anders Conrad
Det Danske Sprog- og Litteraturselskab
IT-redaktør, cand.mag.
Christians Brygge 1
1219 København K
E-mail: [EMAIL PROTECTED]
Tlf. 33 13 06 60