Henrik, I am not sure how you would do that on Windows or Linux, but I 
assume it can be done.  On OS/400, the XML parser will honor whatever 
encoding the returned XML data is in.  I think on the other 
platforms the data is assumed to be in UTF-8, so I guess the answer for 
non-OS/400 platforms is yes, the XML data must be in UTF-8 (or, I think, 
in an encoding that yields the same bytes as UTF-8 for the characters 
used, such as the ASCII range of ISO-8859-1).
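
As a rough illustration of what "the data must be in UTF-8" could mean for 
a service implementation whose strings are ISO-8859-1, here is a minimal 
sketch. The helper name latin1ToUtf8 is hypothetical and is not part of 
Axis C++; it also assumes the input really is ISO-8859-1, which may not 
hold for a given locale.

```cpp
#include <string>

// Hypothetical helper (not an Axis C++ API): converts a string whose bytes
// are ISO-8859-1 code points into the equivalent UTF-8 byte sequence.
std::string latin1ToUtf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte becomes two bytes

    for (std::string::size_type i = 0; i < in.size(); ++i)
    {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80)
        {
            // ASCII range: identical in ISO-8859-1 and UTF-8.
            out += static_cast<char>(c);
        }
        else
        {
            // U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx.
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

A service would call something like this on each string before handing it 
to the serializer, so that the bytes on the wire match the declared 
encoding.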

Nadir K. Amra


"Henrik Nordberg (JIRA)" <[email protected]> wrote on 04/21/2006 
06:39:06 PM:

>     [ http://issues.apache.org/jira/browse/AXISCPP-964?page=comments#action_12375652 ]
> 
> Henrik Nordberg commented on AXISCPP-964:
> -----------------------------------------
> 
> Hi Nadir,
> 
> Are you saying that the strings that the web service implementation 
> sends back should already be in UTF-8? Or is there a way to set the 
> locale in Apache to UTF-8 and have it do the conversion? In short, 
> how would I set the locale to get it to work correctly on Windows and 
> Linux?
> 
> Thanks!
>  - Henrik
> 
> > Server response not UTF-8 encoded (but claims to be)
> > ----------------------------------------------------
> >
> >          Key: AXISCPP-964
> >          URL: http://issues.apache.org/jira/browse/AXISCPP-964
> >      Project: Axis-C++
> >         Type: Bug
> 
> >   Components: SOAP
> >     Versions: current (nightly)
> >  Environment: All platforms, except OS/400
> >     Reporter: Henrik Nordberg
> 
> >
> > (See the end of this description for a one-liner that works around
> > this problem for most cases.)
> > SoapSerializer.cpp, line 379 says
> > serialize( "<?xml version='1.0' encoding='utf-8' ?>", NULL);
> > that is, it declares that the SOAP response is UTF-8 encoded. But 
> > this is only true for OS/400, as can be seen in HTTPTransport.cpp, 
> > lines 311-
> > #ifndef __OS400__
> >         *m_pActiveChannel << this->getHTTPHeaders ();
> >         *m_pActiveChannel << this->m_strBytesToSend.c_str ();
> > #else
> >         // Ebcdic (OS/400) systems need to convert the data to UTF-8.
> >         // Note that free() is correctly used and should not be
> >         // changed to delete().
> >         const char *buf = this->getHTTPHeaders ();
> >         utf8Buf = toUTF8((char *)buf, strlen(buf)+1);
> >         *m_pActiveChannel << utf8Buf;
> >         free(utf8Buf);
> >         utf8Buf = NULL;
> >         utf8Buf = toUTF8((char *)this->m_strBytesToSend.c_str(),
> >                          this->m_strBytesToSend.length()+1);
> >         *m_pActiveChannel << utf8Buf;
> >         free(utf8Buf);
> >         utf8Buf = NULL;
> > #endif
> > This leads clients to decode the response as UTF-8, and they 
> > will fail whenever the response contains non-ASCII characters
> > (i.e., byte values > 127).
> > Axis Java, for example, will produce this error upon decoding: 
> > "java.io.UTFDataFormatException: Invalid byte 2 of 3-byte UTF-8 sequence."
> > A simple workaround is to change SoapSerializer.cpp, line 379:
> > from
> > serialize( "<?xml version='1.0' encoding='utf-8' ?>", NULL);
> > to
> > serialize( "<?xml version='1.0' encoding='ISO-8859-1' ?>", NULL);
> > The real fix, however, is to encode the response with UTF-8 for 
> > all platforms (not just OS/400).
> 

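To illustrate why a client fails on the response described in the report 
quoted above, here is a small sketch of a structural UTF-8 check. The name 
looksLikeUtf8 is hypothetical (it is not an Axis C++ function), and the 
check is deliberately simplified: it verifies lead and continuation bytes 
only and does not reject overlong forms or surrogate code points.

```cpp
#include <string>

// Hypothetical helper: structural UTF-8 check only. Each lead byte must be
// followed by the right number of 10xxxxxx continuation bytes.
bool looksLikeUtf8(const std::string& s)
{
    std::string::size_type i = 0;
    while (i < s.size())
    {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::string::size_type extra;

        if (c < 0x80)                extra = 0;  // ASCII
        else if ((c & 0xE0) == 0xC0) extra = 1;  // 2-byte sequence
        else if ((c & 0xF0) == 0xE0) extra = 2;  // 3-byte sequence
        else if ((c & 0xF8) == 0xF0) extra = 3;  // 4-byte sequence
        else return false;  // stray continuation byte or invalid lead byte

        // Not enough bytes left for the announced sequence length.
        if (s.size() - i - 1 < extra) return false;

        for (std::string::size_type k = 1; k <= extra; ++k)
        {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

An ISO-8859-1 "é" is the single byte 0xE9, which a UTF-8 decoder reads as 
the lead byte of a 3-byte sequence; the ordinary characters that follow are 
not continuation bytes, which is consistent with the "Invalid byte 2 of 
3-byte UTF-8 sequence" error from Axis Java.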