In the code snippet you've shown, you're writing to a StringWriter. That is 
a character stream that collects its output in a StringBuffer, so you're 
not actually writing UTF-8 byte sequences anywhere here. Perhaps there's 
some conversion code (which you haven't shown) that takes that String 
and encodes it into bytes using some encoding other than UTF-8. 
It's an error if your document declares that it has a certain encoding 
(e.g. UTF-8) but is actually encoded as something else (e.g. Windows-1252).
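To illustrate the mismatch, here is a minimal sketch (the class and method 
names are made up for this example) that encodes a single u-umlaut both 
ways. Under Windows-1252 it becomes one byte that is not a legal start of 
a UTF-8 sequence, which would be consistent with the parser error you're 
seeing, though I can't tell from the snippet whether this is exactly what 
your code does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Renders a byte array as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00FC"; // LATIN SMALL LETTER U WITH DIAERESIS

        // The same character produces different bytes in different encodings.
        byte[] utf8 = umlaut.getBytes(StandardCharsets.UTF_8);
        byte[] cp1252 = umlaut.getBytes(Charset.forName("windows-1252"));

        System.out.println("UTF-8:        " + hex(utf8));   // C3 BC
        System.out.println("windows-1252: " + hex(cp1252)); // FC
    }
}
```

A parser that trusts the encoding="UTF-8" declaration and then meets a 
lone FC byte has no valid way to decode it.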

Unless you have a good reason to be using the StringWriter, I'd recommend 
passing an OutputStream to the StreamResult instead. That would give the 
Transformer the responsibility of getting the encoding right.
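Something along these lines, as a rough sketch rather than a drop-in 
replacement for your code. The class name, the helper methods and the 
sample attribute value are made up for illustration, and a 
ByteArrayOutputStream stands in for whatever stream you would really 
write to (a FileOutputStream, for instance). The element and attribute 
names are taken from your snippet.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class WriteXmlAsBytes {

    // Builds a tiny document with a non-ASCII attribute value (sample data).
    static Document sampleDocument() throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element id = doc.createElement("EventRuleInputDefinition");
        id.setAttribute("Value", "M\u00FCller"); // contains a u-umlaut
        doc.appendChild(id);
        return doc;
    }

    // Serializes to a byte stream, so the Transformer itself performs the
    // character-to-byte encoding named in OutputKeys.ENCODING.
    static byte[] serialize(Document doc) throws Exception {
        Transformer transformer =
            TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    // Parses the bytes back and returns the attribute value.
    static String readBackValue(byte[] xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document parsed = builder.parse(new ByteArrayInputStream(xml));
        return parsed.getDocumentElement().getAttribute("Value");
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = serialize(sampleDocument());
        // Round trip: the value read back should match what was written.
        System.out.println(readBackValue(xml).equals("M\u00FCller"));
    }
}
```

Because no String-to-bytes step happens outside the Transformer, the 
bytes on the stream and the encoding declaration in the XML header come 
from the same place and cannot drift apart.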


Michael Glavassevich
XML Technologies and WAS Development
IBM Toronto Lab

"Johnson, Wayne" <> wrote on 10/02/2014 11:26:21 AM:
> I have a Java program that is writing information from a database to
> an XML file.  I create a DOM document, add an element, and set the value 
>     Element id=parent.createElement("EventRuleInputDefinition");
> ...
>     id.setAttribute("Value", getVal());
> Later, I then go to write the Document with:
>         StringWriter sw = new StringWriter();
> ...
>             TransformerFactory transformerFactory =
>                 TransformerFactory.newInstance();
> ...
>                 Transformer transformer = 
>                 transformer.setOutputProperty (OutputKeys.ENCODING, 
>                 // Puts each stanza on a new line
>                 transformer.setOutputProperty(OutputKeys.INDENT, "yes");
>                 DOMSource source = new DOMSource(node); // node is 
> the document root.
>                 StreamResult result = new StreamResult(sw);
>                 transformer.transform(source, result);
> The XML file is properly generated with the header:
> <?xml version="1.0" encoding="UTF-8"?>
> But the element is written with an actual umlaut character which 
> when read back in generates the error:
> [Fatal Error] :564:103: Invalid byte 1 of 1-byte UTF-8 sequence.
> org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence.
> We're using Xerces 2.8.1 (don't laugh, I know it's a bit old). 
> Could this be an issue in Xerces, or am I doing something wrong?
> Thanks.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
