Thanks, Andreas. My bad. The entity being produced is &#xD869;&#xDE1A;
So, does anyone with Axis 1 experience have any suggestions on how to force Axis to output the correct entity?

Thanks,
Aman

-----Original Message-----
From: Andreas Veithen [mailto:[EMAIL PROTECTED]
Sent: Monday, June 09, 2008 2:59 PM
To: [email protected]
Subject: Re: Invalid UTF-8 character encoding in SOAP response

Aman,

D869 DE1A is actually the surrogate pair for the character with code point 2A61A, which is encoded as F0AA989A in UTF-8 (see http://www.cogsci.ed.ac.uk/~richard/utf-8.cgi). The two other character references (��) correspond to another character. I'm not an expert, but the XML specs don't mention surrogate pairs and I think that the correct way of encoding the character as a character reference should be &#x2A61A; in this case. This definitely looks like a bug in the XML parser. I would try replacing the XML parser with a newer version of the same parser or with another parser. I'm not familiar with Axis 1, so I don't know what kind of parser (SAX or StAX) it uses. Maybe somebody else on the list can give a hint?

Andreas

On 9 June 08, at 22:18, Amandeep Singh wrote:

> Hi All,
>
> I am using Axis 1.3. If the response contains a CJK character in
> UTF-8, Axis converts it into an XML entity. On the receiver side,
> XML parsing fails, saying that it is an invalid XML entity.
>
> The character used has UTF-8 value F0AA989A. And Axis converts it
> into ����. And the parser fails at the first
> entity.
>
> Any ideas/hints would be greatly appreciated.
>
> Thanks,
> Aman

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
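
For reference, the arithmetic in Andreas's reply (surrogate pair D869 DE1A, code point 2A61A, UTF-8 bytes F0AA989A) can be checked with a few lines of plain Java. This is only a minimal sketch and not how Axis itself serializes text: the class name SurrogateDemo and the helpers toCharRefs and utf8Hex are made up for illustration, and the escaping covers non-ASCII characters only (it does not handle & or <). The point is that iterating by code point instead of by char yields one character reference per supplementary character rather than one per surrogate.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of Axis.
class SurrogateDemo {

    // Replace every non-ASCII character with a hexadecimal character
    // reference. Walking the string by code point means a surrogate pair
    // produces a single reference instead of two invalid ones.
    static String toCharRefs(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            }
            // Advance by 2 chars for a supplementary code point, else by 1.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // Hex dump of the UTF-8 encoding of a string.
    static String utf8Hex(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            out.append(String.format("%02X", b));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The character from this thread, written as its UTF-16 surrogate pair.
        String s = "\uD869\uDE1A";
        int cp = Character.toCodePoint(s.charAt(0), s.charAt(1));
        System.out.println(Integer.toHexString(cp)); // prints 2a61a
        System.out.println(toCharRefs(s));           // prints &#x2A61A;
        System.out.println(utf8Hex(s));              // prints F0AA989A
    }
}
```

A serializer that loops over individual char values would instead emit one reference for D869 and one for DE1A, which is the output the receiving parser rejects.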
