Hello! I’m using IcedTea 2.4.5 for OpenJDK 7 on Gentoo. This includes JAXP revision 8fe156ad49e2 (in the IcedTea repos), which seems to contain jdk7u51-b31[0], which in turn is revision 626e76f127a4 in the OpenJDK repo jdk7u, as far as I can see. Oh well, you probably know all these version numbers better than I do.
Anyway, after a painful debugging session I found that the default XML transformer implementation (via XSLTC) handles encodings improperly when writing in-memory DOM Documents to a stream, if an encoding other than UTF-8 was specified when the document was parsed.

I’m attaching my test code, which I hope is correct and readable. What it does:

• Read a document with encoding="ISO-8859-1" from an input stream into a DOM Document. The input document itself does not contain any characters outside US-ASCII, which is a subset of ISO-8859-1.
• Add a text node with the text “schön” (“nice” in German) to the document. The “ö” in “schön” is LATIN SMALL LETTER O WITH DIAERESIS (U+00F6). This can, of course, be stored in the in-memory document tree, but may need character conversion when the document is written out later.
• Use a Transformer with output properties set to XML in UTF-8 to write the document into a stream using an identity transformation.

I compared Xalan-J 2.7.1 and the internal implementation (older Xalan?) in the JRE installed with my version of OpenJDK (see above). External Xalan produces documents with XML encoding="UTF-8", while the JRE-internal Xalan keeps encoding="ISO-8859-1", *but writes the “ö” encoded in UTF-8*! This produces wrong content when the document is processed with an XML parser later.

The transformer should use UTF-8, as I requested in the code. Had I not specifically requested an encoding, it could also have used ISO-8859-1, provided it transcoded all characters into that encoding.

In order to use the attached test program, put xalan.jar and xsltc.jar from Xalan-J into your classpath. Even XSLTC from Xalan-J 2.7.1 works, just not the JRE-internal one. My default locale has UTF-8 encoding, in case that matters.

[0] http://icedtea.classpath.org/hg/release/icedtea7-forest-2.4/jaxp/rev/8fe156ad49e2
[1] http://hg.openjdk.java.net/jdk7u/jdk7u/rev/a831c212ee26

--
Nico
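For readers without the attachment, here is a minimal sketch of the three steps described above. It is not the attached test program: the class name EncodingRoundTrip and the helpers roundTrip and declaredEncoding are hypothetical names chosen for illustration, and the one-element input document is an assumption. It uses only standard JAXP and DOM calls, so it runs against whichever TransformerFactory JAXP finds (the JRE-internal one by default, or presumably Xalan-J if its jars are on the classpath).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

class EncodingRoundTrip {

    /** Parses an ISO-8859-1 document, appends "schön", and writes it out requesting UTF-8. */
    static byte[] roundTrip() {
        try {
            // Step 1: input declares ISO-8859-1 but contains only US-ASCII characters.
            String input = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root/>";
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(input.getBytes(StandardCharsets.ISO_8859_1)));

            // Step 2: add a text node containing U+00F6 (escaped so the source file stays ASCII).
            doc.getDocumentElement().appendChild(doc.createTextNode("sch\u00f6n"));

            // Step 3: identity transformation with XML output in UTF-8 explicitly requested.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    /** Returns the encoding named in the XML declaration, or null if none is present. */
    static String declaredEncoding(byte[] xml) {
        // The declaration itself is ASCII, so a single-byte decoding is enough to inspect it.
        String head = new String(xml, StandardCharsets.ISO_8859_1);
        Matcher m = Pattern.compile("encoding=[\"']([^\"']+)[\"']").matcher(head);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] bytes = roundTrip();
        System.out.println("declared encoding: " + declaredEncoding(bytes));
        // If the bytes decode as UTF-8 to "schön", the content was written in UTF-8.
        String asUtf8 = new String(bytes, StandardCharsets.UTF_8);
        System.out.println("content is UTF-8:  " + asUtf8.contains("sch\u00f6n"));
    }
}
```

The mismatch described above would show up as a declared encoding of ISO-8859-1 together with content that is actually UTF-8. A correct implementation should report UTF-8 for both.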