Two comments here:

* The assertion that "Axis2 and Java World understands by default only
UTF-8 encoded character sets" is not correct. XML is based on the
Unicode character _set_ (and that has nothing to do with Axis2/Java).
Any XML parser is required to support at least UTF-8 and UTF-16 as
character _encodings_, but in addition Woodstox (the parser used by
Axis2) supports at least those encodings recognized by Java [1].

* Characters with code points in the range 0-31, except for 9, 10 and
13, are forbidden in XML 1.0, regardless of the character encoding, and
a character reference such as &#x8; does not get around that. If .NET
doesn't enforce this restriction, then that is Microsoft's decision.
That being said, there may be an option in the Woodstox parser to
disable invalid character checks. Please consult the Woodstox
documentation.
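
If no such option turns out to exist, one fallback would be to filter
the response text before it reaches the parser. Below is a minimal Java
sketch of that idea, assuming the raw payload is available as a String.
The names XmlCharSanitizer, sanitize and isValidXml10Char are made up
for illustration, and how to hook this into the Axis2 transport is not
shown. It drops both raw characters and numeric character references
(such as &#x8;) that fall outside the XML 1.0 Char production.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Axis2 or Woodstox.
final class XmlCharSanitizer {
    // Numeric character references such as &#8; or &#x8;
    private static final Pattern CHAR_REF =
            Pattern.compile("&#(x[0-9a-fA-F]+|[0-9]+);");

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isValidXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Removes raw characters and character references that XML 1.0 forbids. */
    static String sanitize(String xml) {
        // Pass 1: drop numeric character references to invalid code points.
        Matcher m = CHAR_REF.matcher(xml);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            int cp;
            try {
                cp = body.charAt(0) == 'x'
                        ? Integer.parseInt(body.substring(1), 16)
                        : Integer.parseInt(body);
            } catch (NumberFormatException e) {
                cp = -1; // too large to be a code point, treat as invalid
            }
            String replacement = isValidXml10Char(cp) ? m.group() : "";
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);

        // Pass 2: drop raw characters that are not valid XML 1.0 characters.
        StringBuilder out = new StringBuilder(sb.length());
        for (int i = 0; i < sb.length(); ) {
            int cp = sb.codePointAt(i);
            if (isValidXml10Char(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Note that this is a blunt instrument: it works on the text level, so it
would also rewrite something that merely looks like a character
reference inside a CDATA section, and it silently changes the data the
service sent. Getting the service fixed remains the better solution.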

Andreas

[1] http://java.sun.com/j2se/1.5.0/docs/guide/intl/encoding.doc.html

On Tue, Feb 23, 2010 at 09:24, Stadelmann Josef
<josef.stadelm...@axa-winterthur.ch> wrote:
> Hi,
>
>
>
> I have a .NET WCF client and an Apache Axis2 web service. My client uses a
> custom binding enforcing HTTP transport and UTF-8 encoding, this because
> Axis2 and Java World understands by default only UTF-8 encoded character
> sets. Recently I ran into a problem: my VB/C# .NET WCF client has to send
> words (all in a SOAP/XML structure) containing the German umlauts ‘ä’ ‘ö’ ‘ü’
> to the server, which resulted in a crash somewhere in a web service routine.
> But again: my custom binding really did say to use UTF-8. Now I ask you: why
> does it send a bad, non-UTF-8 8-bit character to the wire, so that my service
> crashes? Simple answer: because this custom binding with forced UTF-8
> encoding does not work as it should. I can also prove it, because I have
> special behavior implemented as INTERCEPTORS for each outgoing and incoming
> message, on both client and server side.
>
> I reported it to Microsoft in one of these WCF Web Service .NET forums.
> Unfortunately nothing but hot air came back, so I decided to help myself.
> As I was unable to look deep into the code doing wrong (deep in some .NET
> assemblies), I wrote a routine that converts all .NET characters to UTF-8
> encoding before I send the request to the stub. Since then, my problem is
> gone. And of course, at my legacy server I have to turn UTF-8 two-byte
> encoded chars like German umlauts into the DEC MULTINATIONAL CHARACTER SET,
> an 8-bit char set, on input to the legacy server, and convert them back
> before transfer via AXIS2 to the wire. But that was already done before, and
> for that I used some LIBXML routines, IsoLatin2UTF8() and UTF82IsoLatin(),
> written in ANSI C. If you can restrict your .NET chars to ISO Latin, maybe
> you can use these two routines too.
>
> The interesting thing is that with my current binding, all UTF-8 encoded
> characters arriving in the response from the AXIS2 server are perfectly
> converted into what the .NET WCF client and the VB.NET client need to
> display them correctly. So the problem I was faced with is on sending to
> the server.
>
>
>
> Maybe this helps you a bit.
>
> Josef.Stadelmann
>
> @axa-winterthur.ch
>
>
>
>
>
> From: Oleg Kozlov [mailto:oleg.koz...@leadpoint.com]
> Sent: Monday, February 22, 2010 21:50
> To: java-user@axis.apache.org
> Subject: illegal XML character 0x8 from a .Net SOAP service :(
>
>
>
> Hello,
>
>
>
> I have an Axis2 1.5.1 (Java) client, auto-generated by wsdl2java with
> XmlBeans framework.
>
>
>
> I'm integrating with a new vendor that has a SOAP web service written in
> .Net. The service is basically a wrapper around a legacy telephony switch.
> The service returns 1-9 binary numbers in its response in one of the XML
> elements. I have to point out that on our (client) side we do not even need
> to use that element at all, it just has some meaningless data... Apparently
> our vendor's other clients are all using .Net and the .Net framework does
> not care about inserting or parsing illegal characters in the SOAP Body ...
>
>
>
> So, I'm getting an exception (see below).
>
>
>
> I tried solving this problem by writing a custom Axis2 client module where I
> get access to the XML element that contains illegal characters and either
> set its text value to a blank string, or detach and get rid of the whole XML
> element altogether (since I don't need it). However, every time I iterate to
> the bad XML element, the parser blows up with the parsing error again
> inside my custom module. Also, because Axis uses a pull parsing method,
> there is no way to jump over the element with binary data, so I was not
> able to copy the entire XML DOM into a new DOM document while skipping over
> the bad element.
>
>
>
> I could not find a way to get low level access to XML stream where I could,
> for example, read from an input stream and write to an output stream, and
> skip over the illegal character.
>
>
>
> Can anyone recommend another solution?
>
>
>
> ==============================
>
> org.apache.axis2.AxisFault: [com.ctc.wstx.exc.WstxLazyException] Illegal character entity: expansion character (code 0x8) not a valid XML character
>  at [row,col {unknown-source}]: [1,13020]
>  at org.apache.axis2.AxisFault.makeFault(AxisFault.java:430)
>  at leadpoint.voice.service.paetec.callrecords.client.ServiceStub.fromOM(ServiceStub.java:3315)
>  at leadpoint.voice.service.paetec.callrecords.client.ServiceStub.getEnhancedCDR(ServiceStub.java:1063)
>  at leadpoint.voice.service.paetec.callrecords.client.ServiceTest.testgetEnhancedCDR(ServiceTest.java:184)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>  at com.intellij.junit3.JUnit3IdeaTestRunner.doRun(JUnit3IdeaTestRunner.java:108)
>  at com.intellij.junit3.JUnit3IdeaTestRunner.startRunnerWithArgs(JUnit3IdeaTestRunner.java:42)
>  at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:165)
>  at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:60)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>  at com.intellij.rt.execution.application.AppMain.main(AppMain.java:110)
> Caused by: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8) not a valid XML character
>  at [row,col {unknown-source}]: [1,13020]
>  at com.ctc.wstx.exc.WstxLazyException.throwLazily(WstxLazyException.java:45)
>  at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:704)
>  at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
>  at com.ctc.wstx.sr.BasicStreamReader.getTextCharacters(BasicStreamReader.java:830)
>  at javax.xml.stream.util.StreamReaderDelegate.getTextCharacters(StreamReaderDelegate.java:158)
>  at org.apache.axiom.om.impl.builder.SafeXMLStreamReader.getTextCharacters(SafeXMLStreamReader.java:113)
>  at org.apache.axiom.om.impl.llom.OMStAXWrapper.getTextCharacters(OMStAXWrapper.java:418)
>  at org.apache.axiom.om.util.OMXMLStreamReaderValidator.getTextCharacters(OMXMLStreamReaderValidator.java:239)
>  at org.apache.xmlbeans.impl.store.Locale.loadXMLStreamReader(Locale.java:1154)
>  at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:843)
>  at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:826)
>  at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:231)
>  at com.paetec.e800ws.xmlservices.GetEnhancedCDRResponseDocument$Factory.parse(GetEnhancedCDRResponseDocument.java:165)
>  at leadpoint.voice.service.paetec.callrecords.client.ServiceStub.fromOM(ServiceStub.java:3226)
>  ... 23 more
> Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8) not a valid XML character
>  at [row,col {unknown-source}]: [1,13020]
>  at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:605)
>  at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
>  at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2375)
>  at com.ctc.wstx.sr.StreamScanner.checkAndExpandChar(StreamScanner.java:2321)
>  at com.ctc.wstx.sr.StreamScanner.resolveSimpleEntity(StreamScanner.java:1180)
>  at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4675)
>  at com.ctc.wstx.sr.BasicStreamReader.readCoalescedText(BasicStreamReader.java:4124)
>  at com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3699)
>  at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3647)
>  ... 34 more
>
> ==============================
>
>
>
> Thank you,
>
> Oleg.
