On Tue, 2004-03-02 at 14:57, Marco Dubbeld wrote:
> On Tue, 2004-03-02 at 13:29, Carsten Ziegeler wrote:
> > Marco Dubbeld wrote:
> > >
> > > Yep! The value element may contain UTF-8, with Chinese
> > > characters or other non-ISO-8859 characters. While
> > > testing, the
> > >
> > > this.startSerializedXMLRecording(XMLUtils.defaultSerializeToXMLFormat(true));
> > >
> > > will use ISO-8859 encoding (see the properties given back
> > > from XMLUtils). However we should use a property set with the
> > > encoding from the document we are transforming. Otherwise
> > > this causes UTFDataFormatException for Chinese UTF-8 for example.
> >
> > So the best way would be to pass the encoding of the document
> > to the XMLUtils used for serializing. Is this possible?
>
> Properties props = XMLUtils.defaultSerializeToXMLFormat(true);
> String encoding = ??????????
> props.setProperty(OutputKeys.ENCODING, encoding);
> this.startSerializedXMLRecording(props);
>
> How to determine most properly the encoding of the events or the input
> source I do not precisely know.
> java.sun.com is down from my location so one half of my brain is
> blocked. If it's back I'll try to search.
>
> Maybe someone on the list knows?

You just need to choose something. I don't think SAX provides
information about the encoding of the original document. Always using
UTF-8 should be a safe choice.

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
[EMAIL PROTECTED]
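
For illustration, here is a minimal sketch of the "always use UTF-8" suggestion. It does not use Cocoon: a plain JAXP identity Transformer stands in for the serializer behind startSerializedXMLRecording, and the class and method names (Utf8SerializeSketch, utf8OutputFormat, serialize) are hypothetical. The idea it tries to show is that the events handed to the serializer are already Java strings, so the encoding of the original file no longer matters, and UTF-8 can represent any character that arrives.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class Utf8SerializeSketch {

    // Hypothetical stand-in for the Properties returned by
    // XMLUtils.defaultSerializeToXMLFormat(true), with the encoding
    // forced to UTF-8 instead of an ISO-8859 default.
    public static Properties utf8OutputFormat() {
        Properties props = new Properties();
        props.setProperty(OutputKeys.METHOD, "xml");
        props.setProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        props.setProperty(OutputKeys.ENCODING, "UTF-8");
        return props;
    }

    // Serializes the given XML text to bytes using an identity transform
    // configured with the supplied output properties.
    public static byte[] serialize(String xml, Properties format)
            throws TransformerException {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperties(format);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        identity.transform(new StreamSource(new StringReader(xml)),
                           new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Two Chinese characters, written as escapes so the source file
        // encoding does not matter.
        String xml = "<value>\u4e2d\u6587</value>";
        byte[] bytes = serialize(xml, utf8OutputFormat());
        // Decode with the same encoding that was used for serializing.
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Whatever reads the recorded bytes back has to decode them as UTF-8 as well; mixing encodings between the writing and reading side is the kind of mismatch that leads to errors like the one reported above.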
