I am sorry to bring this back from the dead, but I was just trying out
the unmarshal().xstream("ISO-8859-1") method introduced because of this
thread. Unfortunately, it still does not solve the problem (as of Camel
2.5.0).
From non-Camel routes, we have been publishing JMS messages, serializing
the message to XML as follows:

    XStream xstream = new XStream(new DomDriver("ISO-8859-1"));
    String messageXml = xstream.toXML(someObject);

We then use a producerTemplate to publish it to our messaging system.
When we used a route like this:

    from(someIncomingEndpoint)
        .unmarshal().xstream("ISO-8859-1")
        .process(myUpdateProcessor);

our processor received a deserialized message, but the content was not
correct: strings that had been serialized as ISO-8859-1 were deserialized
as UTF-8.
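To show the symptom in isolation, here is a minimal sketch that uses only the JDK. It involves neither Camel nor XStream, and the class name EncodingMismatchDemo and the sample string are made up for illustration. It encodes a string as ISO-8859-1 bytes and then decodes those bytes once with the wrong charset and once with the right one:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Camel or XStream.
class EncodingMismatchDemo {

    // Decode raw bytes into a String using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "caf" plus e-acute; the escape keeps the source file ASCII-only.
        String original = "caf\u00e9";

        // What ends up on the wire if the sender writes ISO-8859-1.
        byte[] wire = original.getBytes(StandardCharsets.ISO_8859_1);

        // Receiver assumes UTF-8: the lone 0xE9 byte is not valid UTF-8.
        String wrong = decode(wire, StandardCharsets.UTF_8);
        // Receiver uses the same charset as the sender.
        String right = decode(wire, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 decode matches original:      " + wrong.equals(original));
        System.out.println("ISO-8859-1 decode matches original: " + right.equals(original));
    }
}
```

Only characters outside ASCII are affected, which is why such a mismatch can go unnoticed until a message contains accented text.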
I modified our route to introduce a new Processor (instead of the in-line
unmarshal) that did the following:

    String messageBody = exchange.getIn().getBody(String.class);
    XStream xstream = new XStream(new DomDriver("ISO-8859-1"));
    Object myObject = xstream.fromXML(messageBody);
    exchange.getIn().setBody(myObject);

This works fine: the text our processor receives is correct ISO-8859-1 and
nothing is garbled.
I set a breakpoint and stepped through the Camel code with the in-line
unmarshal. It does pass down the specified encoding (ISO-8859-1). However,
it constructs the XStream object using the default XppDriver (on which you
can't specify an encoding).
According to the XStream documentation, the XppDriver (and the other drivers,
apart from DomDriver) relies on the underlying InputStream/OutputStream passed
to the XStream object to determine the encoding.
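If the parser takes its encoding from whatever stream it is handed, one general way to pin the encoding is to decode the bytes yourself and give the parser a Reader. The sketch below uses the JDK's built-in StAX parser as a stand-in. It is not XStream's XppDriver or the Woodstox reader that Camel ends up using, and the class and method names (ExplicitCharsetReaderDemo, rootText) are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; uses the JDK StAX parser, not XStream.
class ExplicitCharsetReaderDemo {

    // Return the text of the root element, decoding the bytes with the
    // charset chosen by the caller rather than one guessed by the parser.
    static String rootText(byte[] xmlBytes, Charset charset) {
        Reader chars = new InputStreamReader(new ByteArrayInputStream(xmlBytes), charset);
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(chars);
            // Advance to the first (root) start element.
            while (xml.next() != XMLStreamConstants.START_ELEMENT) {
                // nothing to do for events before the root element
            }
            String text = xml.getElementText();
            xml.close();
            return text;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // ISO-8859-1 bytes with no XML declaration, so the document itself
        // gives the parser no hint about its encoding.
        byte[] wire = "<message>caf\u00e9</message>".getBytes(StandardCharsets.ISO_8859_1);

        String text = rootText(wire, StandardCharsets.ISO_8859_1);
        System.out.println("text matches original: " + text.equals("caf\u00e9"));
    }
}
```

Because the parser only ever sees characters here, it has no opportunity to apply a different charset of its own.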
I found this in the unmarshal method of AbstractXStreamWrapper.java:

    public Object unmarshal(Exchange exchange, InputStream stream) throws Exception {
        HierarchicalStreamReader reader = createHierarchicalStreamReader(exchange, stream);
        try {
            return getXStream(exchange.getContext().getClassResolver()).unmarshal(reader);
        } finally {
            reader.close();
        }
    }

The HierarchicalStreamReader that is created is of type
com.thoughtworks.xstream.io.xml.StaxReader.
When I stepped into the unmarshal method of the XStream class, I saw that
the reader passed in (the same StaxReader) has a property called "in" that
is of type com.ctc.wstx.sr.ValidatingStreamReader.
This, in turn, had 2 properties:

    mDocInputEncoding = {java.lang.String@4784}"ISO-8859-1"
    mDocXmlEncoding = {java.lang.String@4785}"UTF-8"

I can't say for certain that this is why the text is coming out as UTF-8, but
it does seem suspicious that although the InputEncoding is set to ISO-8859-1,
the XmlEncoding is still "UTF-8".
In any event, for our own purposes we have created 2 Processor classes to
serialize/deserialize our XML. We can't rely on the unmarshal/marshal
methods when it comes to encoding and our XML.
Just wanted to pass along the news that the fix doesn't seem to have solved
the problem.
--
View this message in context:
http://camel.465427.n5.nabble.com/XStream-and-forcing-ISO-8859-1-Encoding-tp478220p3355313.html
Sent from the Camel - Users mailing list archive at Nabble.com.