From what I read, Axis confuses encodings when the server doesn't send the Content-Type header. All of your tests below have that header sent.
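
To make that point concrete, here is a minimal sketch of how a receiver might look for the charset parameter and notice when it is missing. The class and method names (ContentTypeCharset, charsetOf) are made up for illustration; this is not Axis code and says nothing about what Axis actually does when the header is absent.

```java
import java.util.Locale;
import java.util.Optional;

public class ContentTypeCharset {
    // Hypothetical helper: returns the charset parameter of a Content-Type
    // header value, or empty if the header or the parameter is missing.
    static Optional<String> charsetOf(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? Optional.empty() : Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/xml;charset=iso-8859-1"));
        System.out.println(charsetOf("text/xml; charset=utf-8"));
        System.out.println(charsetOf(null)); // header not sent at all
    }
}
```

When the result is empty, the receiver has to fall back on something else (the XML declaration, or a default), and that fallback is where a mismatch could creep in.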

Manuel Mall wrote:
I have written a quick and dirty Echo web service and it seems to work
fine. Below the log from SoapUI which uses httpclient, so the log is
actually in 'httpclient' format:

Here the client invokes the echo service with iso-8859-1 encoding and
"Abricó Gimarães" is sent as "Abric[0xf3] Gimar[0xe3]es".

"POST /axis2/services/EchoService HTTP/1.1[\r][\n]"
"Content-Type: text/xml;charset=iso-8859-1[\r][\n]"
"SOAPAction: "http://example/Echo/echoRequest"[\r][\n]"
"User-Agent: Jakarta Commons-HttpClient/3.0.1[\r][\n]"
"Host: 192.168.47.10:6060[\r][\n]"
"Content-Length: 326[\r][\n]"
"[\r][\n]"
"<soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:exam="http://example">[\r][\n]"
"   <soapenv:Header/>[\r][\n]"
"   <soapenv:Body>[\r][\n]"
"      <exam:echo>[\r][\n]"
"         <exam:encoding>iso-8859-1</exam:encoding>[\r][\n]"
"         <exam:text>Abric[0xf3] Gimar[0xe3]es</exam:text>[\r][\n]"
"      </exam:echo>[\r][\n]"
"   </soapenv:Body>[\r][\n]"
"</soapenv:Envelope>"

Same in the response:

<< "HTTP/1.1 200 OK[\r][\n]"
<< "Date: Wed, 01 Nov 2006 15:29:13 GMT[\r][\n]"
<< "Server: Simple-Server/1.1[\r][\n]"
<< "Transfer-Encoding: chunked[\r][\n]"
<< "Content-Type: text/xml; charset=iso-8859-1[\r][\n]"
<< "1"
<< "1"
<< "d"
<< "[\r]"
<< "[\n]"
<< "<?xml version='1.0' encoding='iso-8859-1'?><soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"><soapenv:Header
/><soapenv:Body><ns1:echoResponse
xmlns:ns1="http://example"><ns1:text>Abric[0xf3]
Gimar[0xe3]es</ns1:text></ns1:echoResponse></soapenv:Body></soapenv:Envelope>"
<< "[\r]"
<< "[\n]"

Here is the same using utf-8. Here "Abricó Gimarães" is sent as
"Abric[0xc3][0xb3] Gimar[0xc3][0xa3]es".

"POST /axis2/services/EchoService HTTP/1.1[\r][\n]"
"Content-Type: text/xml;charset=utf-8[\r][\n]"
"SOAPAction: "http://example/Echo/echoRequest"[\r][\n]"
"User-Agent: Jakarta Commons-HttpClient/3.0.1[\r][\n]"
"Host: 192.168.47.10:6060[\r][\n]"
"Content-Length: 323[\r][\n]"
"[\r][\n]"
"<soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:exam="http://example">[\r][\n]"
"   <soapenv:Header/>[\r][\n]"
"   <soapenv:Body>[\r][\n]"
"      <exam:echo>[\r][\n]"
"         <exam:encoding>utf-8</exam:encoding>[\r][\n]"
"         <exam:text>Abric[0xc3][0xb3] Gimar[0xc3][0xa3]es</exam:text>[\r][\n]"
"      </exam:echo>[\r][\n]"
"   </soapenv:Body>[\r][\n]"
"</soapenv:Envelope>"

And the response:

<< "HTTP/1.1 200 OK[\r][\n]"
<< "Date: Wed, 01 Nov 2006 15:30:55 GMT[\r][\n]"
<< "Server: Simple-Server/1.1[\r][\n]"
<< "Transfer-Encoding: chunked[\r][\n]"
<< "Content-Type: text/xml; charset=utf-8[\r][\n]"
<< "1"
<< "1"
<< "a"
<< "[\r]"
<< "[\n]"
<< "<?xml version='1.0' encoding='utf-8'?><soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"><soapenv:Header
/><soapenv:Body><ns1:echoResponse
xmlns:ns1="http://example"><ns1:text>Abric[0xc3][0xb3]
Gimar[0xc3][0xa3]es</ns1:text></ns1:echoResponse></soapenv:Body></soapenv:Envelope>"
<< "[\r]"
<< "[\n]"

So far everything looks good and nothing seems to be wrong with Axis2.
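
The byte sequences in the two logs above can be reproduced without any web service at all. The sketch below encodes the same string in both charsets and prints non-ASCII bytes in the [0x..] notation the wire log uses; the class and method names (WireBytes, wireFormat) are invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WireBytes {
    // Renders the encoded bytes roughly the way the wire log does:
    // printable ASCII as-is, every other byte as [0x..].
    static String wireFormat(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            int v = b & 0xff;
            if (v >= 0x20 && v < 0x7f) {
                sb.append((char) v);
            } else {
                sb.append(String.format("[0x%x]", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Abricó Gimarães", written with escapes so the source file's own
        // encoding cannot interfere with the result.
        String name = "Abric\u00f3 Gimar\u00e3es";
        System.out.println(wireFormat(name, StandardCharsets.ISO_8859_1));
        // prints Abric[0xf3] Gimar[0xe3]es
        System.out.println(wireFormat(name, StandardCharsets.UTF_8));
        // prints Abric[0xc3][0xb3] Gimar[0xc3][0xa3]es
    }
}
```

One byte per accented character in iso-8859-1 and two in utf-8, which matches what the echo service received and sent back.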

Manuel

-----Original Message-----
From: Manuel Mall [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 1 November 2006 22:13
To: [email protected]
Subject: Re: Converting from UTF-8 to ISO-8859-I with Axis2 and Java

On Wednesday 01 November 2006 20:36, Davanum Srinivas wrote:
Manuel,

that's exactly why i am asking folks to help with a test that we can
use to recreate the problem...based off of the following test for
UTF-16

http://svn.apache.org/repos/asf/webservices/commons/trunk/modules/axiom/modules/axiom-tests/src/test/java/org/apache/axiom/soap/impl/llom/CharacterEncodingTest.java

Davanum,

do you have a simple Axis service which just echoes a String somewhere?
(I don't have one handy).

By calling such a service with various encodings one should be able
to test, for example using SoapUI, whether there is a problem with
specific encodings.

Manuel

-- dims

On 11/1/06, Manuel Mall <[EMAIL PROTECTED]> wrote:
On Wednesday 01 November 2006 20:27, Xinjun Chen wrote:
Hi,

I have a similar problem.
I am using the Axis2 client, and the end service sends a RIGHT
SINGLE QUOTATION MARK (U+2019, or 0xE2 0x80 0x99, or e28099).
In Ethereal, I can see the bytes e2 80 99. However, Axis2 just
converts it to a question mark (0x3F). I am not sure whether this
is also caused by this bug.

Yes, could be a similar / related problem.

What encoding is set in the XML prolog of the message?

How did you determine that Axis2 converts it to a question mark?
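
For context on why the prolog encoding matters here, a small sketch of one common way U+2019 ends up as 0x3F in plain Java: encoding it into a charset that has no such character. This only illustrates the general mechanism and is an assumption about a possible cause, not a diagnosis of what Axis2 does; the class name QuestionMark is made up.

```java
import java.nio.charset.StandardCharsets;

public class QuestionMark {
    // Hex-dumps a byte array, two lowercase digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String quote = "\u2019"; // RIGHT SINGLE QUOTATION MARK

        // UTF-8 can represent it: three bytes, as seen in Ethereal.
        System.out.println(hex(quote.getBytes(StandardCharsets.UTF_8)));
        // prints e28099

        // ISO-8859-1 cannot; String.getBytes substitutes the charset's
        // replacement byte, which is '?'.
        System.out.println(hex(quote.getBytes(StandardCharsets.ISO_8859_1)));
        // prints 3f
    }
}
```

So a question mark can appear either when the bytes are decoded with the wrong charset or when the decoded string is later re-encoded into a charset that lacks the character, which is why it helps to know where in the chain it was observed.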

Manuel

I am about to roll out the project. Is there an estimated
timeline for resolving this problem?

Thank you.

Regards,
Xinjun


On 11/1/06, Davanum Srinivas <[EMAIL PROTECTED]> wrote:

Bruno,

Please log a JIRA bug with your sample.

thanks,
dims

On 11/1/06, Manuel Mall <[EMAIL PROTECTED]> wrote:
On Wednesday 01 November 2006 13:43, Bruno Negrao wrote:
Bruno,

encoding issues in my experience can be very tricky to debug.
Never rely on any characters you see on the screen or in
print to figure out the actual encoding used. The tool which
displays / prints may have woven in its own encoding/decoding
'magic' and may show you distorted results. Look at the
actual byte stream and check how the characters are actually
encoded in it, i.e. how the message appears 'on the wire'.
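
A short sketch of why the displayed characters can mislead: the same bytes look different depending on the charset the viewing tool assumes. The class name DecodeMismatch and its helper are invented for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeMismatch {
    // Encodes text with one charset and decodes the bytes with another,
    // which is what a tool does when it guesses the encoding wrongly.
    static String viewedAs(String text, Charset actual, Charset assumed) {
        byte[] wire = text.getBytes(actual);
        return new String(wire, assumed);
    }

    public static void main(String[] args) {
        String name = "Abric\u00f3 Gimar\u00e3es"; // "Abricó Gimarães"

        // Correct UTF-8 bytes shown by a tool that assumes iso-8859-1:
        // each accented letter turns into two unrelated characters.
        System.out.println(viewedAs(name,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));

        // Correct iso-8859-1 bytes shown by a tool that assumes UTF-8:
        // the accented letters are malformed input and are replaced
        // by U+FFFD.
        System.out.println(viewedAs(name,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
    }
}
```

In both cases the bytes on the wire were perfectly valid; only the viewer's assumption was wrong, so only a byte-level dump settles the question.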

Manuel


Manuel,

I already tried my tests without tcpmon in the middle and the
problems persisted anyway.
Bruno,

you may well be correct that the problem is in Axis, but to be
sure I would recommend you use a 'proper' protocol analyser,
e.g. Ethereal (www.ethereal.com) worked well for me in the past
when I needed to look at 'bytes on the wire'.

Manuel

But I'll check on the server side whether the words were at least
stored correctly as iso-8859-1.

But this does not dismiss my observation about
response.getXMLStreamReader().getCharacterEncodingScheme()

That method should definitely return "iso-8859-1" instead of
"utf-8".
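
As a point of comparison, here is a sketch that asks a plain StAX parser the same question on a small iso-8859-1 document. It uses the StAX implementation bundled with the JDK directly on a byte stream, not Axiom, so it says nothing about what the reader obtained from response.getXMLStreamReader() reports; the class name DeclaredEncoding and the <text> element are assumptions for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class DeclaredEncoding {
    // Returns { encoding reported for the XML declaration, text of the
    // first element } for the given raw document bytes.
    static String[] inspect(byte[] xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(xml));
            // Encoding named in the XML declaration, or null if none.
            String declared = reader.getCharacterEncodingScheme();
            String text = null;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    text = reader.getElementText();
                    break;
                }
            }
            reader.close();
            return new String[] { declared, text };
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String doc = "<?xml version='1.0' encoding='iso-8859-1'?>"
                + "<text>Abric\u00f3 Gimar\u00e3es</text>";
        String[] result = inspect(doc.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println("declared encoding: " + result[0]);
        System.out.println("decoded text:      " + result[1]);
    }
}
```

Running the equivalent check against the Axis2 response, with the raw bytes captured on the wire as input, would show whether the discrepancy comes from the message itself or from how the reader is constructed.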

I believe this is proof that Axis2 is not doing what it is
supposed to do.

I'd ask you to carefully read my previous posting and analyze
the evidence I presented there.

Thank you,
bruno.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




--
Davanum Srinivas : http://www.wso2.net
(Oxygen for Web Service Developers)
