I was having trouble outputting special characters in French and German from a document created with dom4j. The characters originated in the text body of an element. The document would be generated on the server (in a web application) and transmitted to a client (a Java applet), where the French or German text would be displayed. But on the client, the characters (like an "e" with an accent) would show up corrupted.
I managed to fix this problem by using an org.dom4j.io.XMLWriter instead of a regular OutputStreamWriter. Here's the server-side code that worked:

private byte[] convertToByteArray(Document document) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
        // XMLWriter(OutputStream) encodes the output with its default
        // OutputFormat encoding, UTF-8, so the bytes actually match what
        // the document declares and what the applet expects to decode.
        XMLWriter writer = new XMLWriter(out);
        writer.write(document);
        writer.flush();
    } catch (UnsupportedEncodingException uex) {
        uex.printStackTrace(System.out);
    } catch (IOException iox) {
        iox.printStackTrace(System.out);
    }
    return out.toByteArray();
}

Here's the server-side code that failed:

private byte[] convertToByteArray(Document document) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // With no charset argument, OutputStreamWriter falls back to the
    // platform default encoding, which need not match the encoding the
    // client expects, so the accented characters get mangled.
    OutputStreamWriter writer = new OutputStreamWriter(out);
    try {
        document.write(writer);
        writer.flush();
    } catch (IOException iox) {
        // exceptions were silently ignored here
    }
    return out.toByteArray();
}
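
(For completeness: I believe the plain-writer version could also have been salvaged by naming the charset explicitly so that it matches the encoding the document declares. This is just a sketch of that alternative, with a made-up method name and UTF-8 assumed as the target encoding; it's not the code I actually used.)

private byte[] convertToByteArrayExplicitCharset(Document document) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
        // Name the charset instead of relying on the platform default.
        // "UTF-8" is an assumption; it has to agree with what the XML
        // declaration advertises and what the client decodes.
        OutputStreamWriter writer = new OutputStreamWriter(out, "UTF-8");
        document.write(writer);
        writer.flush();
    } catch (IOException iox) {
        iox.printStackTrace(System.out);
    }
    return out.toByteArray();
}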

My next task is to output and transmit double-byte (Japanese) characters in the text of an element. I am hoping the above change will suffice.
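
If it doesn't, my fallback plan is to pin the encoding explicitly through an org.dom4j.io.OutputFormat rather than relying on the defaults. A minimal sketch of that idea (the method name is made up, and UTF-8 is only an assumption; any charset the applet can decode, such as Shift_JIS, should work as long as the declaration and the bytes agree):

private byte[] convertToByteArrayWithEncoding(Document document) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
        // Pin the output encoding so the XML declaration and the actual
        // bytes agree; UTF-8 covers Japanese as well as French/German.
        OutputFormat format = new OutputFormat();
        format.setEncoding("UTF-8");
        XMLWriter writer = new XMLWriter(out, format);
        writer.write(document);
        writer.flush();
    } catch (IOException iox) {
        iox.printStackTrace(System.out);
    }
    return out.toByteArray();
}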