José,

Neither (a) nor (b) is UTF-8. Both are sequences of XML numeric character references, each referring to a Unicode code point. UTF-8 is a byte encoding of those code points; the references themselves are plain ASCII text. The two sequences are strictly equivalent, except that the first uses hexadecimal values while the second uses decimal values.
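To illustrate the point, here is a minimal sketch in plain Java (no Axis or commons-lang involved). The class name `CharRefDemo` and the method `decode` are hypothetical, and the two reference strings are reconstructed from the code points of "Основное" rather than copied from your message, so treat them as an assumption about what (a) and (b) looked like.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CharRefDemo {
    // Matches hexadecimal (&#x41E;) and decimal (&#1054;) numeric character references.
    private static final Pattern REF =
            Pattern.compile("&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));");

    // Replaces every numeric character reference with the code point it names.
    static String decode(String s) {
        Matcher m = REF.matcher(s);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(s, last, m.start());
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            out.appendCodePoint(codePoint);
            last = m.end();
        }
        out.append(s, last, s.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed form of (a): hexadecimal references.
        String hex = "&#x41E;&#x441;&#x43D;&#x43E;&#x432;&#x43D;&#x43E;&#x435;";
        // Assumed form of (b): decimal references.
        String dec = "&#1054;&#1089;&#1085;&#1086;&#1074;&#1085;&#1086;&#1077;";

        String fromHex = decode(hex);
        String fromDec = decode(dec);

        // Both notations name the same code points, so the decoded strings match.
        System.out.println(fromHex.equals(fromDec));

        // UTF-8 only appears once the decoded string is turned into bytes.
        byte[] utf8 = fromHex.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length + " bytes in UTF-8");
    }
}
```

The comparison shows that the choice between hexadecimal and decimal notation does not change the decoded text; the byte encoding is a separate step that happens only on serialization.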
Andreas

On Thu, Nov 13, 2008 at 18:53, José Ferreiro <[EMAIL PROTECTED]> wrote:
> Hello all,
>
> I have the Russian word (taken as an example): Основное
>
> that is encoded as UTF-8 by Axis as:
>
> Основное (a)
>
> I can transmit this kind of information in a well-formed XML packet using
> Axis 1.4, after a client request, from the server back to the client. There
> is no problem. The deserialization works perfectly.
>
> However, if I try to transmit it applying WSS4J with encryption, signature
> and timestamp, the following error arises:
>
> org.apache.xml.security.encryption.XMLEncryptionException: An invalid XML
> character (Unicode: 0x1e)
> was found in the element content of the document.
>
> Therefore, in order to avoid invalid characters in the packet, I decided
> to escape all XML chars
> using org.apache.commons.lang.StringEscapeUtils.escapeXML [1]
>
> In the client, in order to recover the original word, I decided to do an
> unescapeXML [1], which gives this Unicode string:
>
> Основное (b)
>
> First, it should be concluded that I am not getting the same Unicode string
> as at the beginning (a), where [(a) != (b)]
>
> I was then wondering what kind of encoding I got.
> I looked at this web site http://2cyr.com/decode/?lang=en to understand more
> and it looks like I got windows-1251 (see [2])
> that can be displayed in a browser as encoding="iso8859-1".
>
> My question is: why didn't I get UTF-8, and how is it possible that I got (b)?
>
> Thank you for reading and for any comments you might have.
>
> José Ferreiro
>
> Many thanks to Martin Gainty and Ognjen Blagojevic for already commenting and
> helping in another thread I posted.
>
> [1] -
> http://commons.apache.org/lang/api-release/org/apache/commons/lang/StringEscapeUtils.html
> [2] - http://en.wikipedia.org/wiki/CP1251
>
> PS: Thanks to Martin and
>
> --
> José Ferreiro
> MSc in Communication Systems, EPFL.
