Anna Simbirtsev wrote:
Hi,
Could you give me an example of how to transcode a UTF-8 string to
Unicode and back? I think that if I get the string in UTF-8 encoding,
I need to convert it to Unicode before I pass it into the Xerces
parser. Is that right?
UTF-8 is an encoding of Unicode, so I'm not sure I understand your
question. Xerces-C uses UTF-16 internally, so you would need to
transcode strings from UTF-8 to UTF-16 for APIs that expect arrays of
UTF-16 code units, such as DOMDocument::createElement(const XMLCh*
tagName). You can, however, parse UTF-8 documents without transcoding them.
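To make the UTF-8 to UTF-16 step concrete, here is a minimal sketch that uses only the C++ standard library. It is not Xerces-C code: in a real application you would normally use the transcoding facilities that ship with Xerces-C. The function names utf8ToUtf16 and utf16ToUtf8 are made up for this illustration, and the sketch assumes a build where XMLCh is a 16-bit type, so that each char16_t below corresponds to one UTF-16 code unit.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Xerces-C): decode UTF-8 bytes into
// UTF-16 code units. Throws std::runtime_error on malformed input.
std::u16string utf8ToUtf16(const std::string& in)
{
    // Smallest code point allowed for each sequence length (rejects overlongs).
    static const std::uint32_t minCp[] = { 0x0, 0x80, 0x800, 0x10000 };

    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;   // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp < minCp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid code point in UTF-8 input");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF become a surrogate pair in UTF-16.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Hypothetical helper (not part of Xerces-C): encode UTF-16 code units as
// UTF-8 bytes. Throws std::runtime_error on unpaired surrogates.
std::string utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 >= in.size())
                throw std::runtime_error("unpaired high surrogate");
            const std::uint32_t low = in[i + 1];
            if (low < 0xDC00 || low > 0xDFFF)
                throw std::runtime_error("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::runtime_error("unpaired low surrogate");
        }

        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Under the 16-bit XMLCh assumption, the code units in the result of utf8ToUtf16 are what an API such as DOMDocument::createElement expects, though you may still need a cast or copy depending on how XMLCh is defined on your platform. Running utf16ToUtf8 on that result gives back the original UTF-8 bytes.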
There was a thread last week that discussed some of the issues with
local code page transcoding. In it you will find a link to an earlier
thread that has some transcoding code snippets.
Dave