We have a set of middleware connectors for 30-year-old, non-relational databases, including a number of connectors that produce XML from the data. We encountered the same problem and used the mixed content model, almost identical to what David showed (with an attribute). That way, a user who wanted to ignore the "funny" characters could simply concatenate all the text nodes, while a user who needed the original string could reconstitute it. We provided sample code in Java, VB, C# and a few other languages to show how to reconstitute these strings, and also how to generate them. We avoided the Base64 method because it made the output no longer "human-readable", which is one of the *benefits* of XML! ;)
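For illustration, here is a minimal Java sketch of the mixed content approach. It is not the sample code mentioned above: the class name MixedContentCodec and its methods are hypothetical, and the <tag> and <char number="..."/> elements are simply borrowed from David's example. It assumes XML 1.0 and treats any code point outside the XML 1.0 Char production as "funny".

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, named for this example only.
class MixedContentCodec {

    // True if the code point is allowed by the XML 1.0 Char production.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Wraps s in <tag>...</tag>; each disallowed character becomes
    // <char number="N"/>, where N is its decimal code point.
    static String encode(String s) {
        StringBuilder out = new StringBuilder("<tag>");
        s.codePoints().forEach(cp -> {
            if (!isXmlChar(cp)) {
                out.append("<char number=\"").append(cp).append("\"/>");
            } else if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.append("</tag>").toString();
    }

    // Walks the children of the root element. Text nodes are always kept;
    // <char> elements are turned back into characters only if keepFunny is true.
    static String decode(String xml, boolean keepFunny) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
        StringBuilder out = new StringBuilder();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                out.append(n.getNodeValue());
            } else if (keepFunny && n.getNodeType() == Node.ELEMENT_NODE
                    && "char".equals(n.getNodeName())) {
                String number = ((Element) n).getAttribute("number");
                out.appendCodePoint(Integer.parseInt(number));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = encode("Hello \u000C");
        System.out.println(xml);
        System.out.println(decode(xml, false).length()); // text nodes only
        System.out.println(decode(xml, true).length());  // original string
    }
}
```

Passing keepFunny = false corresponds to the "just concatenate the text nodes" case, and keepFunny = true to reconstituting the original string. One caveat with this sketch: a literal carriage return is a legal XML character, but parsers normalize it to a line feed, so a connector that must preserve it exactly would probably need to give it the same <char> treatment.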
-----Original Message-----
From: David Sheldon [mailto:[EMAIL PROTECTED]]
Sent: Friday, November 04, 2005 5:24 AM
To: [EMAIL PROTECTED]
Subject: Re: Encoding invalid XML characters

On Fri, Nov 04, 2005 at 12:52:29PM -0000, Tom Sugden wrote:
> Hello,
>
> Can anybody suggest the best approach for encoding invalid XML characters
> into an XML document? For example, the Unicode character with the
> hexadecimal code 000C can be encoded into a Java character literal as
> follows:
>
> char c = '\u000C';
>
> I tried encoding this character into an XML string using a standard
> character reference. For example:
>
> String s = "<tag></tag>";

I think the easiest way is to make the tag element have the type
base64Binary. This way the string "\u000C\u000A\u000D" would become

<tag>DAoN</tag>

however the string "Hello world" would become

<tag>SGVsbG8gd29ybGQ=</tag>

Alternatively you could use mixed content, and so "Hello \u000C" would
become:

<tag>Hello <char number="12"/></tag>

But this one would be harder to process by consumers of your system.

I wonder what solutions other people have for this problem.

David

--
"[Hackers] then only have to crack the password to take control"
-- IT Week on a terrible Unix security flaw

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
