SourceTransformer can't transform to DOM with non-US-ASCII characters like 'ä'
or 'ü'
------------------------------------------------------------------------------------
Key: SM-414
URL: https://issues.apache.org/activemq/browse/SM-414
Project: ServiceMix
Type: Bug
Components: servicemix-core
Versions: 3.0-M1, 3.0-M2, 3.0, incubation
Environment: W2K, J2SE 1.4.2, Xerces 2.7.1, default locale of OS with
character set 'windows-1252'
Reporter: Juergen Mayrbaeurl
Priority: Blocker
Fix For: 3.0, incubation
Attachments: SourceTransformer-sources.zip
The class org.apache.servicemix.jbi.jaxp.SourceTransformer, which belongs to
the core classes of ServiceMix and is used very often, has major problems
transforming Source to DOM data structures when the source contains
non-US-ASCII characters like 'ä' or 'ü'.
The class uses DocumentBuilders for the transformation (see method 'public
DOMSource toDOMSourceFromStream(StreamSource source) throws
ParserConfigurationException, IOException, SAXException'). It calls 'public
Document parse(InputStream is, String systemId) throws SAXException,
IOException' without explicitly telling the DocumentBuilder which character
encoding to use. As a result, the DocumentBuilder (Xerces 2.7.1) raises fatal
errors (exceptions) because it encounters invalid character code sequences
(especially with UTF-8 and multi-byte characters like 'ä' or 'ö'). This means
that you can't use non-US-ASCII characters in messages as soon as ServiceMix
uses an instance of SourceTransformer to do any transformation to DOM. This is
the case when tracing messages in the DeliveryChannel or evaluating an XPath
expression, e.g. for content-based routing.
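The failure can be reproduced outside ServiceMix with a few lines. The sketch
below is only an illustration under stated assumptions: the message bytes are
encoded as ISO-8859-1 (close to the 'windows-1252' default of the environment
above), the document carries no XML declaration, and the class and method
names (EncodingFailureDemo, parsesWithoutEncodingHint) are made up for this
example. It uses whatever JAXP parser the JDK provides, not necessarily
Xerces 2.7.1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of ServiceMix.
class EncodingFailureDemo {

    // Returns true if the bytes parse when handed to the parser as a raw
    // InputStream, i.e. without any hint about the character encoding.
    static boolean parsesWithoutEncodingHint(byte[] xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml));
            return true;
        } catch (SAXException | IOException e) {
            // A malformed byte sequence is reported as one of these types.
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String message = "<msg>\u00e4</msg>"; // 'ä', no XML declaration

        // Assumption: the sender wrote the message as ISO-8859-1 bytes.
        byte[] latin1 = message.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = message.getBytes(StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 bytes parse: "
                + parsesWithoutEncodingHint(latin1));
        System.out.println("UTF-8 bytes parse:      "
                + parsesWithoutEncodingHint(utf8));
    }
}
```

Without a declaration or an external hint, an XML parser falls back to UTF-8,
and the single ISO-8859-1 byte for 'ä' is not a valid UTF-8 sequence.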
The solution to this problem is straightforward: tell the DocumentBuilder the
character encoding it has to use. It looks like this:
    public DOMSource toDOMSourceFromStream(StreamSource source) throws
            ParserConfigurationException, IOException, SAXException {
        DocumentBuilder builder = createDocumentBuilder();
        String systemId = source.getSystemId();
        Document document = null;
        InputStream inputStream = source.getInputStream();
        if (inputStream != null) {
            InputSource inputsource = new InputSource(inputStream);
            inputsource.setSystemId(systemId);
            inputsource.setEncoding(defaultCharEncodingName); // <-- Very important
            document = builder.parse(inputsource);
        }
        else {
            Reader reader = source.getReader();
            if (reader != null) {
                document = builder.parse(new InputSource(reader));
            }
            else {
                throw new IOException("No input stream or reader available");
            }
        }
        return new DOMSource(document, systemId);
    }
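To see the effect of InputSource.setEncoding in isolation, here is a minimal
standalone sketch. It is not the attached change itself: the class
EncodingFixDemo, the method rootText and its 'encoding' parameter are
hypothetical stand-ins (the parameter plays the role of
defaultCharEncodingName above), and the input is assumed to be ISO-8859-1
bytes without an XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of ServiceMix.
class EncodingFixDemo {

    // Parses the bytes with an explicitly supplied encoding and returns the
    // text content of the root element.
    static String rootText(byte[] xml, String encoding) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            InputSource input = new InputSource(new ByteArrayInputStream(xml));
            input.setEncoding(encoding); // tell the parser how to decode bytes
            Document document = builder.parse(input);
            return document.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the message arrives as ISO-8859-1 bytes.
        byte[] latin1 =
                "<msg>\u00e4</msg>".getBytes(StandardCharsets.ISO_8859_1);

        String text = rootText(latin1, "ISO-8859-1");
        System.out.println("Root text is 'ä': " + "\u00e4".equals(text));
    }
}
```

One caveat: a single fixed encoding only helps when the incoming bytes really
use it. Bytes in a different encoding would then be decoded incorrectly, so
the encoding name probably needs to be configurable.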
I've attached the original source file of SourceTransformer (3.0 SNAPSHOT,
2006-04-20) and the changed version (unfortunately, I can't create a real
patch).
Kind regards
Juergen