Quoting "WJ Krpelan" <[EMAIL PROTECTED]>:

Hi,
hope I got this right.
The encoding with &#<hex>; looks perfect to me.
You should check whether the actual hex values correspond to the Unicode code points of your Russian characters.

Hmm, how do I do this?
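
One way to do this check, as a minimal sketch in Java: print each character of the string as a hexadecimal character reference and compare the result with what shows up in the SOAP body. The class name CodePointCheck and the sample text are assumptions made for illustration; note that XML writes hexadecimal references as &#xHHHH;.

```java
public class CodePointCheck {

    /** Returns every code point of the text as an XML hex character reference. */
    public static String toHexReferences(String text) {
        StringBuilder sb = new StringBuilder();
        // Step by code point, not by char, so surrogate pairs stay intact.
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            sb.append("&#x")
              .append(Integer.toHexString(cp).toUpperCase())
              .append(';');
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: the Cyrillic letters U+041F, U+0440, U+0438,
        // written as escapes so the source file encoding does not matter.
        String sample = "\u041F\u0440\u0438";
        System.out.println(toHexReferences(sample));
    }
}
```

Russian letters should come out in the Cyrillic block, U+0400 to U+04FF. Values such as 0x1E at this point would suggest the string was already damaged before Axis encoded it.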

If this is the case, how did you verify that the characters were broken inside the DOM tree? Is your tool capable of showing Russian characters?

Yes, I debugged it with Eclipse, so I could see that the characters were not displayed correctly.

Broken would mean that the numeric values in your UTF-8 XML do not correspond to the UTF-8 values of your Russian characters, which are quite different from the Unicode code points.
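
To illustrate the difference between a code point and its UTF-8 bytes, here is a small sketch, again in Java. The class name Utf8VsCodePoint and the helper utf8Hex are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;

public class Utf8VsCodePoint {

    /** Returns the UTF-8 bytes of the text as space-separated hex pairs. */
    public static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 because Java bytes are signed.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String pe = "\u041F"; // CYRILLIC CAPITAL LETTER PE
        System.out.println("code point: U+" + String.format("%04X", pe.codePointAt(0)));
        System.out.println("UTF-8 bytes: " + utf8Hex(pe)); // D0 9F
    }
}
```

The character reference carries the code point (41F), while a raw UTF-8 stream carries the two bytes D0 9F, so what tcpmon shows depends on which of the two forms the serializer wrote.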

HTH,
Wolfgang





--- On Fri, 8/8/08, Carsten Burghardt <[EMAIL PROTECTED]> wrote:

From: Carsten Burghardt <[EMAIL PROTECTED]>
Subject: Encoding problem
To: [email protected]
Date: Friday, August 8, 2008, 1:51 PM
Hi,

first of all, I know that this is more a question for the user list, but nobody could help me there, so apologies, but I'll try here as I don't know how to continue. I have a webservice (Axis 1.4) that connects to an Alfresco server and stores metadata from emails (like subject, sender, ...). This works fine with ISO-* or UTF-8 encoded emails. But once I have an email with a more "exotic" character set like KOI8-R (Russian), I get an error on the server side because of invalid characters (like 0x1e). I know that there are no control characters in the content, so I watched the traffic with tcpmon and noticed that all characters were totally screwed up.

So I traced the Axis code and saw that the characters were encoded with &#<hex>; in the SoapBody. Afterwards the DOM tree is serialized in the DoAllSender class, and then the characters are broken in the generated XML. When I switched the encoding of the Soap Message to KOI8-R instead of UTF-8, the characters showed up fine in tcpmon, but then the server reported an error about a different illegal character (0x1), which is probably because the message is converted to UTF-8 at a certain step.

So I guess my question is: what is the proposed way to transmit those characters to a webservice (apart from Base64 encoding them...)?

Many thanks

Carsten
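
For the KOI8-R case described in the message above, one possible cause, which the thread does not confirm, is that the raw mail bytes are turned into a Java String with the wrong charset before they reach Axis. The sketch below shows the difference; the class name Koi8rDecode, the helper decode and the sample bytes are assumptions, and it relies on the JRE providing the KOI8-R charset.

```java
import java.nio.charset.Charset;

public class Koi8rDecode {

    /** Decodes raw bytes using the charset declared by the mail part. */
    public static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Hypothetical sample: the Russian word for "hello" in KOI8-R bytes.
        byte[] raw = {
            (byte) 0xF0, (byte) 0xD2, (byte) 0xC9,
            (byte) 0xD7, (byte) 0xC5, (byte) 0xD4
        };

        String right = decode(raw, "KOI8-R");     // Cyrillic letters
        String wrong = decode(raw, "ISO-8859-1"); // Latin-1 garbage

        System.out.println(right.equals("\u041F\u0440\u0438\u0432\u0435\u0442")); // true
        System.out.println(right.equals(wrong));                                   // false
    }
}
```

Once the text is a correctly decoded String, it can be serialized as UTF-8 like any other text, without changing the encoding of the SOAP message itself.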


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail:
[EMAIL PROTECTED]



