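Posting the solution. The issue is in the UTF8Encoder class of Axis, which does not account for surrogate pairs: each half of the pair is escaped on its own, which produces invalid character references. The solution is to override that class so that a surrogate pair is combined into a single code point before it is escaped.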
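For anyone hitting the same problem, here is a minimal sketch of the surrogate handling described above. It is a standalone illustration, not the actual Axis code: the class name `SurrogateEscaper` and its methods are hypothetical, and wiring the logic into an overridden Axis encoder is left out. It assumes the goal is to escape every non-ASCII character as one hexadecimal character reference per code point.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper, not part of Axis: escapes text for XML and emits
// one character reference per Unicode code point instead of one per
// UTF-16 code unit.
final class SurrogateEscaper {

    static void writeEncoded(Writer writer, String text) throws IOException {
        int i = 0;
        while (i < text.length()) {
            // codePointAt combines a valid high/low surrogate pair into
            // a single supplementary code point (e.g. D869 DE1A -> 2A61A).
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);

            if (cp >= 0xD800 && cp <= 0xDFFF) {
                // An unpaired surrogate has no valid XML representation.
                throw new IllegalArgumentException(
                        "Unpaired surrogate U+" + Integer.toHexString(cp).toUpperCase());
            }
            switch (cp) {
                case '&': writer.write("&amp;"); break;
                case '<': writer.write("&lt;"); break;
                case '>': writer.write("&gt;"); break;
                case '"': writer.write("&quot;"); break;
                default:
                    if (cp > 0x7F) {
                        // Non-ASCII: one hex character reference for the whole code point.
                        writer.write("&#x");
                        writer.write(Integer.toHexString(cp).toUpperCase());
                        writer.write(";");
                    } else {
                        writer.write(cp);
                    }
            }
        }
    }

    static String encode(String text) {
        StringWriter out = new StringWriter();
        try {
            writeEncoded(out, text);
        } catch (IOException e) {
            // StringWriter does not throw; rethrow defensively.
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

With this sketch, `SurrogateEscaper.encode("\uD869\uDE1A")` returns `&#x2A61A;`, a single reference for the code point discussed in this thread, rather than one reference per surrogate. Control characters that XML 1.0 forbids are not filtered here.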
Is this fixed in the latest version of Axis? Just curious.

Thanks,
Aman

-----Original Message-----
From: Amandeep Singh [mailto:[EMAIL PROTECTED]
Sent: Monday, June 09, 2008 3:09 PM
To: [email protected]
Subject: RE: Invalid UTF-8 character encoding in SOAP response

Thanks Andreas. My bad. The entity being produced is &#xD869;&#xDE1A;

So, anyone who has Axis 1 experience, any suggestions as to how to force Axis to output the correct entity?

Thanks,
Aman

-----Original Message-----
From: Andreas Veithen [mailto:[EMAIL PROTECTED]
Sent: Monday, June 09, 2008 2:59 PM
To: [email protected]
Subject: Re: Invalid UTF-8 character encoding in SOAP response

Aman,

D869 DE1A is actually the surrogate pair for the character with code point 2A61A, which is encoded as F0AA989A in UTF-8 (see http://www.cogsci.ed.ac.uk/~richard/utf-8.cgi). The two other character references (��) correspond to another character. I'm not an expert, but the XML specs don't mention surrogate pairs and I think that the correct way of encoding the character as a character reference should be &#x2A61A; in this case.

This definitely looks like a bug in the XML parser. I would try to replace the XML parser by a new version of the same parser or by another parser. I'm not familiar with Axis 1, so I don't know what kind of parser (SAX or StAX) it uses. Maybe somebody else on the list can give a hint?

Andreas

On 9 June 08, at 22:18, Amandeep Singh wrote:

> Hi All,
>
> I am using axis 1.3. If the response contains a CJK character in
> UTF-8, axis converts it into an xml entity. On the receiver side,
> xml parsing fails saying that it is an invalid xml entity.
>
> The character used has UTF-8 value F0AA989A. And axis converts it
> into &#xD869;&#xDE1A;��. And parser fails at first
> entity.
>
> Any ideas/hints would be greatly appreciated?
>
> Thanks,
> Aman

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
