Jaya Nageswar wrote:
Hi David,
Thanks for the update. I translated the characters from UCS-2 to UTF-8 using
C APIs. Actually, I took these Chinese characters (您如是) from Google Translate
and used them in an XML file to test the Unicode support. When I translated these
characters from UCS-2 to UTF-8 using C APIs, I got these characters (귦꺡髧„).
Now I am not getting the errors from the Xerces parser.
I don't think you "got" any characters from the transcoding APIs. Also,
you need to be careful when associating the glyphs you see on a
display device with a particular character, since they depend on
the font and the encoding assumed by the application and rendering system.
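
As a rough illustration of that last point, here is a minimal Python sketch
(Python is used only for brevity; it is not what you are running). It takes
the three characters from your message, written as code points, and decodes
the same UTF-8 bytes under two different assumptions. The choice of Latin-1
as the "wrong" encoding is just an assumption for the example; I do not know
what your display tool actually assumed.

```python
# The three characters from the message, written as code points so the
# example does not depend on the encoding of this file.
text = "\u60a8\u5982\u662f"

# The bytes a UTF-8 encoded XML file would contain for these characters.
raw = text.encode("utf-8")

# Correct assumption: the bytes are UTF-8, so we get the same characters back.
as_utf8 = raw.decode("utf-8")

# Wrong assumption (Latin-1 chosen only for illustration): every byte is
# treated as one character, so nine unrelated characters come out.
as_latin1 = raw.decode("latin-1")

print(raw.hex(" "))                        # e6 82 a8 e5 a6 82 e6 98 af
print(len(as_utf8), as_utf8 == text)       # 3 True
print(len(as_latin1), as_latin1 == text)   # 9 False
```

The bytes are identical in both cases; only the interpretation differs, and
that is what decides which glyphs end up on the screen.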
But I have a question. Will the characters themselves change from one format
to another format? If I have a string "abcd", will it change from one format
to another format? I understand that the encoding in different formats is
different, but I do not understand why the characters themselves are changing
from one format to another format. Any information related to this will be a
great help to me.
I suggest you read this article on Wikipedia:
http://en.wikipedia.org/wiki/Character_encoding
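
For your "abcd" example, a minimal Python sketch may help. Python has no
codec named UCS-2, so this assumes "utf-16-le", which produces the same bytes
as little-endian UCS-2 for characters in the Basic Multilingual Plane. The
variable names are made up for the example.

```python
text = "abcd"

# Same four characters, two different byte sequences.
utf8_bytes = text.encode("utf-8")       # 4 bytes: 61 62 63 64
ucs2_bytes = text.encode("utf-16-le")   # 8 bytes: 61 00 62 00 63 00 64 00

# Decoding each byte sequence with the encoding that produced it
# gives back exactly the same characters.
assert utf8_bytes.decode("utf-8") == text
assert ucs2_bytes.decode("utf-16-le") == text

# Decoding the UTF-8 bytes as if they were 16-bit code units pairs the
# bytes up (0x6261, 0x6463) and yields two unrelated characters.
mixed = utf8_bytes.decode("utf-16-le")

print(utf8_bytes.hex(" "))
print(ucs2_bytes.hex(" "))
print([hex(ord(c)) for c in mixed])     # ['0x6261', '0x6463']
```

So the characters do not change when you transcode; only the bytes that
represent them do. The characters appear to change when bytes in one
encoding are read as if they were in another.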
Dave