I was looking into an issue a coworker ran into when calling the Xmethods
Babelfish translation service (specifically to test out non-ASCII chars). I
decided to try a simple WSDL2Java-based client against the service to see how
it compared to how he was using AXIS (1.2RC2), and I was able to reproduce
nearly the same problem.
I figured I'd ask here before digging too deep on this one, hoping that maybe
someone can spot what I'm doing wrong, what the service itself is doing wrong,
or what may be a bug in 1.2RC2. I'm sure I've seen non-ASCII chars work just
fine in AXIS response envelopes before, so I'm not sure what's up with this
one...
WSDL: http://www.xmethods.net/sd/2001/BabelFishService.wsdl
Service is at: http://services.xmethods.net:80/perl/soaplite.cgi
Sample request envelope:
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soapenv:Body><ns1:BabelFish
soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:ns1="urn:xmethodsBabelFish"><translationmode
xsi:type="xsd:string">en_fr</translationmode><sourcedata
xsi:type="xsd:string">I'm going to the
beach.</sourcedata></ns1:BabelFish></soapenv:Body></soapenv:Envelope>
Sample response envelope:
<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"><SOAP-ENV:Body><namesp1:BabelFishResponse
xmlns:namesp1="urn:xmethodsBabelFish"><return xsi:type="xsd:string">je vais à
la plage
</return></namesp1:BabelFishResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>
With an HTTP content type header of:
Content-Type: text/xml; charset=utf-8
The AXIS Fault I'm getting back in the simple WSDL2Java generated client (unit
test hand-modified to just run as a standalone client without JUnit) is
slightly different from what my coworker is getting, but both stack traces are
the same from Message.getSOAPEnvelope down: in both cases AXIS is trying to
parse the XML for the response SOAP envelope via the DeserializationContext.
The encoding for the response envelope is marked UTF-8, as are the HTTP
response headers associated with that envelope. The envelope looks fairly
valid by eye, and a simple Java program that just posted the above request to
the same service, then read and dumped out the response using a UTF-8
encoding, didn't complain about any of the bytes in the response.
Note: I don't get the error if the response contains just ASCII chars.
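One caveat on the "didn't complain" check above: as far as I know, Java's
default UTF-8 decoding (InputStreamReader, new String(bytes, charset)) replaces
malformed input with U+FFFD instead of throwing, so a plain read-and-dump can
pass even when the bytes are not valid UTF-8. Below is a minimal sketch of a
stricter check. The class and method names are hypothetical, the sample bytes
are an assumption (a single 0xE0 byte followed by a space, not a capture from
the service), and it assumes Java 7+ for StandardCharsets.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of AXIS or the original test.
final class StrictUtf8Check {

    // Returns true only if the whole byte array is well-formed UTF-8.
    static boolean isWellFormedUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // Thrown instead of silently substituting U+FFFD.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] ascii = "je vais a la plage".getBytes(StandardCharsets.US_ASCII);
        // Assumed sample: a lone 0xE0 byte followed by a space.
        byte[] suspect = {'j', 'e', ' ', (byte) 0xE0, ' ', 'l', 'a'};

        System.out.println("ascii strict:   " + isWellFormedUtf8(ascii));
        System.out.println("suspect strict: " + isWellFormedUtf8(suspect));

        // Lenient decoding does not throw; the bad byte becomes U+FFFD.
        String lenient = new String(suspect, StandardCharsets.UTF_8);
        System.out.println("lenient has U+FFFD: " + (lenient.indexOf('\uFFFD') >= 0));
    }
}
```

Running something like this over the raw response bytes (read from the
connection's InputStream before any Reader touches them) would show whether
the body really is UTF-8 or only labelled as such.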
Here's one exception I'm getting in the standalone test (MustUnderstand checker
asks for the SOAP env):
{http://xml.apache.org/axis/}stackTrace:org.xml.sax.SAXParseException:
Character conversion error: "Malformed UTF-8 char -- is an XML encoding
declaration missing?" (line number may be too low).
at org.apache.crimson.parser.InputEntity.fatal(InputEntity.java:1100)
at org.apache.crimson.parser.InputEntity.fillbuf(InputEntity.java:1072)
at org.apache.crimson.parser.InputEntity.isXmlDeclOrTextDeclPrefix(InputEntity.java:914)
at org.apache.crimson.parser.Parser2.maybeXmlDecl(Parser2.java:1048)
at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:520)
at org.apache.crimson.parser.Parser2.parse(Parser2.java:318)
at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:442)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:345)
at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:226)
at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:645)
at org.apache.axis.Message.getSOAPEnvelope(Message.java:424)
And here's another, almost identical one from another batch of client code
using AXIS directly (without WSDL2Java):
AxisFault
faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
faultSubcode:
faultString: java.io.UTFDataFormatException: Invalid byte 2 of 3-byte UTF-8 sequence.
faultActor:
faultNode:
faultDetail:
{http://xml.apache.org/axis/}stackTrace:java.io.UTFDataFormatException:
Invalid byte 2 of 3-byte UTF-8 sequence.
at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at javax.xml.parsers.SAXParser.parse(Unknown Source)
at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:226)
at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:645)
at org.apache.axis.Message.getSOAPEnvelope(Message.java:424)
at org.apache.axis.handlers.soap.MustUnderstandChecker.invoke(MustUndersstandChecker.java:62)
at org.apache.axis.client.AxisClient.invoke(AxisClient.java:173)
at org.apache.axis.client.Call.invokeEngine(Call.java:2719)
at org.apache.axis.client.Call.invoke(Call.java:2702)
at org.apache.axis.client.Call.invoke(Call.java:2378)
at org.apache.axis.client.Call.invoke(Call.java:2301)
at org.apache.axis.client.Call.invoke(Call.java:1758)
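
One possibility that would fit the "Invalid byte 2 of 3-byte UTF-8 sequence"
message, offered only as a guess since I have not confirmed it from a raw byte
capture: the service may be sending the accented character as a single
ISO-8859-1 byte while labelling the response UTF-8. The sketch below (class
and method names are hypothetical) just prints the bytes of the same text in
both encodings so they can be compared against a dump of the response.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class EncodingBytes {

    // Formats a byte array as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00E0 la"; // a-grave, space, "la"
        System.out.println("UTF-8:      " + hex(text.getBytes(StandardCharsets.UTF_8)));
        System.out.println("ISO-8859-1: " + hex(text.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

In UTF-8 the character is the two bytes C3 A0. In ISO-8859-1 it is the single
byte E0, which a UTF-8 decoder reads as the lead byte of a 3-byte sequence. The
next byte would then be the space (20), which is not a continuation byte, so
"byte 2" of the sequence is invalid. That would also explain why ASCII-only
responses parse fine.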