Henrik, I am not sure how you would do that on Windows or Linux, but I assume it can be done. On OS/400, the XML parser honors whatever encoding is declared in the XML data that is returned. On the other platforms I think the data is assumed to be in UTF-8, so I guess the answer for non-OS/400 platforms is yes: the XML data must already be in UTF-8 (or, I think, in something byte-compatible with UTF-8 for the characters actually used, such as ISO-8859-1 data restricted to 7-bit ASCII).
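To make the byte-compatibility point concrete, here is a minimal sketch of a structural UTF-8 check. The name `isValidUtf8` is hypothetical and not part of Axis C++, and the check is simplified: it verifies lead and continuation bytes only and does not reject every overlong or surrogate encoding. ISO-8859-1 bytes in the range 0xE0-0xEF (for example 0xE9, "é") look like the start of a 3-byte UTF-8 sequence, which would be consistent with the "3-byte UTF-8 sequence" error reported in the issue.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an Axis C++ API): returns true if every byte in
// `s` is either 7-bit ASCII or part of a structurally well-formed UTF-8
// multi-byte sequence. Simplified sketch; overlong forms are not all rejected.
bool isValidUtf8(const std::string &s)
{
    std::size_t i = 0;
    while (i < s.size())
    {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra; // number of continuation bytes expected

        if (c < 0x80)                    extra = 0; // plain ASCII
        else if (c >= 0xC2 && c <= 0xDF) extra = 1; // 2-byte sequence
        else if (c >= 0xE0 && c <= 0xEF) extra = 2; // 3-byte sequence
        else if (c >= 0xF0 && c <= 0xF4) extra = 3; // 4-byte sequence
        else return false; // stray continuation byte or invalid lead byte

        // The sequence must not run past the end of the buffer.
        if (i + extra >= s.size())
            return false;

        // Each continuation byte must have the bit pattern 10xxxxxx.
        for (std::size_t k = 1; k <= extra; ++k)
        {
            unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80)
                return false;
        }
        i += extra + 1;
    }
    return true;
}
```

With this sketch, pure ASCII text passes, "café" encoded as UTF-8 (`63 61 66 C3 A9`) passes, and the same word encoded as ISO-8859-1 (`63 61 66 E9`) fails.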
Nadir K. Amra

"Henrik Nordberg (JIRA)" <[email protected]> wrote on 04/21/2006 06:39:06 PM:

>     [ http://issues.apache.org/jira/browse/AXISCPP-964?page=comments#action_12375652 ]
>
> Henrik Nordberg commented on AXISCPP-964:
> -----------------------------------------
>
> Hi Nadir,
>
> Are you saying that the strings that the web service implementation
> sends back should already be in UTF-8? Or is there a way to set the
> locale in Apache to UTF-8 and have it do the conversion? In short,
> how would I set the locale to get it to work correctly on Windows and Linux?
>
> Thanks!
> - Henrik
>
> > Server response not UTF-8 encoded (but claims to be)
> > ----------------------------------------------------
> >
> >          Key: AXISCPP-964
> >          URL: http://issues.apache.org/jira/browse/AXISCPP-964
> >      Project: Axis-C++
> >         Type: Bug
> >   Components: SOAP
> >     Versions: current (nightly)
> >  Environment: All platforms, except OS/400
> >     Reporter: Henrik Nordberg
> >
> > (See the end of this description for a one-liner that works around
> > this problem for most cases.)
> >
> > SoapSerializer.cpp, line 379 says
> >     serialize( "<?xml version='1.0' encoding='utf-8' ?>", NULL);
> > that is, it declares that the SOAP response is UTF-8 encoded. But this
> > is only true for OS/400, as can be seen in HTTPTransport.cpp, lines 311-
> >
> > #ifndef __OS400__
> >     *m_pActiveChannel << this->getHTTPHeaders ();
> >     *m_pActiveChannel << this->m_strBytesToSend.c_str ();
> > #else
> >     // Ebcdic (OS/400) systems need to convert the data to UTF-8. Note that free() is
> >     // correctly used and should not be changed to delete().
> >     const char *buf = this->getHTTPHeaders ();
> >     utf8Buf = toUTF8((char *)buf, strlen(buf)+1);
> >     *m_pActiveChannel << utf8Buf;
> >     free(utf8Buf);
> >     utf8Buf = NULL;
> >     utf8Buf = toUTF8((char *)this->m_strBytesToSend.c_str(), this->m_strBytesToSend.length()+1);
> >     *m_pActiveChannel << utf8Buf;
> >     free(utf8Buf);
> >     utf8Buf = NULL;
> > #endif
> >
> > This leads to clients trying to decode the response as UTF-8, which
> > fails whenever the response contains non-ASCII characters
> > (i.e., > 127).
> > Axis Java, for example, will produce this error upon decoding:
> > "java.io.UTFDataFormatException: Invalid byte 2 of 3-byte UTF-8 sequence."
> >
> > A simple workaround is to change SoapSerializer.cpp, line 379:
> > from
> >     serialize( "<?xml version='1.0' encoding='utf-8' ?>", NULL);
> > to
> >     serialize( "<?xml version='1.0' encoding='ISO-8859-1' ?>", NULL);
> >
> > The real fix, however, is to encode the response with UTF-8 for
> > all platforms (not just OS/400).
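As a rough illustration of what "encode the response with UTF-8 for all platforms" could involve, the sketch below converts a byte string to UTF-8 under the assumption that the service's strings are ISO-8859-1. That assumption is not confirmed anywhere in the issue, and it is also what makes the ISO-8859-1 workaround above plausible. The function name `latin1ToUtf8` is hypothetical; it is not the `toUTF8()` used in the OS/400 branch and not part of Axis C++. A real fix would more likely use a locale-aware conversion facility, since the server's actual character set depends on its environment.

```cpp
#include <string>

// Hypothetical helper (not an Axis C++ API): converts ISO-8859-1 bytes to
// UTF-8. ASSUMPTION: the input really is ISO-8859-1. Each ISO-8859-1 byte
// maps directly to the Unicode code point with the same value (U+0000-U+00FF).
std::string latin1ToUtf8(const std::string &in)
{
    std::string out;
    out.reserve(in.size() * 2); // worst case: every byte becomes two bytes

    for (std::string::size_type i = 0; i < in.size(); ++i)
    {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80)
        {
            // 7-bit ASCII is identical in both encodings.
            out.push_back(static_cast<char>(c));
        }
        else
        {
            // U+0080-U+00FF encode as two bytes: 110000xx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

For example, the ISO-8859-1 byte 0xE9 ("é") becomes the two bytes 0xC3 0xA9, while ASCII input is returned unchanged. In the non-OS/400 branch quoted above, such a conversion would presumably be applied to the bytes being sent before they are written to the channel, but where exactly it belongs is a design decision for the Axis C++ maintainers.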
