[ https://issues.apache.org/activemq/browse/SM-414?page=all ]
Juergen Mayrbaeurl updated SM-414:
----------------------------------
Attachment: SampleInMessage.xml
Sample In Message with non US-ASCII characters
> SourceTransformer can't transform to DOM with non US-ASCII characters like 'ä'
> or 'ü'
> ------------------------------------------------------------------------------------
>
> Key: SM-414
> URL: https://issues.apache.org/activemq/browse/SM-414
> Project: ServiceMix
> Type: Bug
> Components: servicemix-core
> Versions: 3.0-M1, 3.0-M2, 3.0, incubation
> Environment: W2K, J2SE 1.4.2, Xerces 2.7.1, default locale of OS with
> character set 'windows-1252'
> Reporter: Juergen Mayrbaeurl
> Priority: Blocker
> Fix For: 3.0, incubation
> Attachments: SampleInMessage.xml, SourceTransformer-sources.zip
>
>
> The class org.apache.servicemix.jbi.jaxp.SourceTransformer, which belongs to
> the core classes of ServiceMix and is used very often, has major problems
> transforming a Source into a DOM data structure when the source contains
> non US-ASCII characters like 'ä' or 'ü'.
> The class uses DocumentBuilders (see method 'public DOMSource
> toDOMSourceFromStream(StreamSource source) throws
> ParserConfigurationException, IOException, SAXException') for the
> transformation and calls the method 'public Document parse(InputStream is,
> String systemId) throws SAXException, IOException' without explicitly telling
> the DocumentBuilder which character encoding it should use. This results in
> fatal errors (exceptions) thrown by the DocumentBuilder (Xerces 2.7.1),
> because it encounters invalid character code sequences (especially with UTF-8
> and multi-byte characters like 'ä' or 'ö'). This means that you can't use
> non US-ASCII characters in messages as soon as ServiceMix uses an instance of
> the class SourceTransformer to do any transformation to DOM. This is the case
> when tracing messages in the DeliveryChannel or when evaluating an XPath
> expression, e.g. for content-based routing.
> The solution to this problem is straightforward: tell the DocumentBuilder
> which character encoding it has to use. It looks like this:
> public DOMSource toDOMSourceFromStream(StreamSource source)
>         throws ParserConfigurationException, IOException, SAXException {
>     DocumentBuilder builder = createDocumentBuilder();
>     String systemId = source.getSystemId();
>     Document document = null;
>     InputStream inputStream = source.getInputStream();
>     if (inputStream != null) {
>         InputSource inputsource = new InputSource(inputStream);
>         inputsource.setSystemId(systemId);
>         inputsource.setEncoding(defaultCharEncodingName); // <-- Very important
>
>         document = builder.parse(inputsource);
>     }
>     else {
>         Reader reader = source.getReader();
>         if (reader != null) {
>             document = builder.parse(new InputSource(reader));
>         }
>         else {
>             throw new IOException("No input stream or reader available");
>         }
>     }
>     return new DOMSource(document, systemId);
> }
> I've attached the original source file of SourceTransformer (3.0 SNAPSHOT,
> 2006-04-20) and the changed version (unfortunately I can't create a real
> patch).
> Kind regards
> Juergen
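
The behaviour described above can probably be reproduced outside ServiceMix with a small standalone program. The following is only a sketch under assumptions: the class name EncodingDemo and the helper parseRootText are made up for illustration, and it runs against whatever JAXP parser the JDK provides rather than the Xerces 2.7.1 from the report. It parses the same windows-1252 bytes twice, first without an encoding hint and then with InputSource.setEncoding, which is the call the proposed fix relies on.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of ServiceMix.
public class EncodingDemo {

    // Parses raw XML bytes and returns the text of the root element.
    // If 'encoding' is non-null it is handed to the parser as the
    // encoding of the byte stream; otherwise the parser autodetects.
    static String parseRootText(byte[] xml, String encoding)
            throws ParserConfigurationException, IOException, SAXException {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputSource source = new InputSource(new ByteArrayInputStream(xml));
        if (encoding != null) {
            source.setEncoding(encoding);
        }
        Document document = builder.parse(source);
        return document.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // No XML declaration, so nothing in the bytes names the encoding.
        // \u00e4 is 'ä' and \u00fc is 'ü'; escapes keep this source file ASCII.
        byte[] xml = "<msg>\u00e4 \u00fc</msg>"
                .getBytes(Charset.forName("windows-1252"));

        try {
            parseRootText(xml, null);
            System.out.println("Parsed without an encoding hint");
        } catch (SAXException | IOException e) {
            // The parser assumed UTF-8 and hit an invalid byte sequence.
            System.out.println("Failed without an encoding hint: " + e);
        }

        String text = parseRootText(xml, "windows-1252");
        System.out.println("Parsed with an encoding hint, text length " + text.length());
    }
}
```

Note that the hint only helps when the bytes really are in the named encoding. A message that is actually UTF-8 would be decoded incorrectly if a windows-1252 default were forced on it, so the value used for defaultCharEncodingName probably needs to be configurable.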