I have the following problem. Some of my beans have attributes that may contain non-ASCII characters (for example, the "o with a slash" or the "a-e ligature" one sees in Danish words). When these are written out as XML attributes, they are not escaped. When I then try to read the beans back, the parser complains: org.xml.sax.SAXParseException: Character conversion error: "Unconvertible UTF-8 character beginning with 0xf8" (line number may be too low). at org.apache.crimson.parser.InputEntity.fatal(InputEntity.java:1100)
I've looked at how attribute values are escaped, but haven't found much of use there. It seems that characters such as ampersands are escaped, but non-ASCII characters are not. I'm not sure whether this is really a betwixt bug; maybe the parser is incorrectly rejecting well-formed XML. I do know that if I replace the character in the XML file with an escape (e.g., o slash with &#248;), the file is read correctly, but it is then saved incorrectly, since the ampersand now gets escaped, yielding &amp;#248; in the example above.
I've recently ported my system from my own home-brewed XML persistence mechanism to betwixt. In my old system, I escaped all non-ASCII characters with their equivalent decimal codes. I've looked through the FAQ and the user mailing list archives, but didn't see a reference to this problem. Can someone provide a hint as to how to get around it? Would it be preferable to simply escape non-ASCII characters? (If so, I could certainly provide a first pass at the code for that.)
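To make the proposal concrete, here is a rough sketch of the kind of escaping I mean. It is not betwixt's code: the class and method names (NonAsciiEscaper, escape) are made up for illustration, and I am assuming it would be applied to attribute values just before they are written. It escapes the usual markup characters and replaces every character above 0x7F with a decimal character reference.

```java
// Hypothetical helper, not part of betwixt: a sketch of escaping
// attribute values so that the output file contains only ASCII.
final class NonAsciiEscaper {

    private NonAsciiEscaper() {
        // static utility, no instances
    }

    /**
     * Escapes XML markup characters and replaces every non-ASCII
     * code point with a decimal character reference such as &#248;.
     */
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); ) {
            // Work on code points so surrogate pairs become one reference.
            int cp = value.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:
                    if (cp > 0x7F) {
                        // Non-ASCII: emit the decimal code, e.g. 248 for o slash.
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.append((char) cp);
                    }
            }
        }
        return out.toString();
    }
}
```

With this, a value containing the o slash (U+00F8) would be written as &#248;, which any XML parser should read back as the original character regardless of the file's encoding. It only helps if the ampersand of the reference is not escaped a second time on the way out.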
Thanks for any help.
-Peter
