Jerry Cwiklik created UIMA-5791:
-----------------------------------

             Summary: UIMA-AS: fix client SAXParseException when deserializing metadata
                 Key: UIMA-5791
                 URL: https://issues.apache.org/jira/browse/UIMA-5791
             Project: UIMA
          Issue Type: Bug
          Components: Async Scaleout
            Reporter: Jerry Cwiklik
            Assignee: Jerry Cwiklik
             Fix For: 2.10.4AS


The XML parser fails with a SAXParseException when the client tries to 
deserialize service metadata. The scenario that causes the error is:

UIMA-AS client running on Windows

Service running on Linux

The client sends a getMeta request and receives a response from the service. 
The client then tries to deserialize the metadata and gets:

Jun 06, 2018 2:25:10 PM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl$2 onMessage
WARNING: org.apache.uima.util.InvalidXMLException: Invalid descriptor at <unknown source>.
    at org.apache.uima.util.impl.XMLParser_impl.parse(XMLParser_impl.java:219)
    at org.apache.uima.util.impl.XMLParser_impl.parseResourceMetaData(XMLParser_impl.java:438)
    at org.apache.uima.util.impl.XMLParser_impl.parseResourceMetaData(XMLParser_impl.java:420)
    at org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl.handleMetadataReply(BaseUIMAAsynchronousEngineCommon_impl.java:1178)
    at org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl$2.run(BaseUIMAAsynchronousEngineCommon_impl.java:2065)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.lang.Thread.run(Thread.java:811)
Caused by: org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence.
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.apache.uima.util.impl.XMLParser_impl.parse(XMLParser_impl.java:202)
    ... 7 more

 

A workaround for the above is to set -D"file.encoding=UTF-8" on the client.
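
One plausible reading of why the file.encoding workaround helps, offered as an 
assumption rather than a confirmed diagnosis: if the client turns the metadata 
string into bytes using the platform default charset, a non-UTF-8 default can 
produce bytes that a strict UTF-8 decoder rejects. The sketch below illustrates 
that effect in isolation. The class name, the helper isValidUtf8, and the sample 
XML are hypothetical, and ISO-8859-1 is only a stand-in for a non-UTF-8 default.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class DefaultCharsetSketch {

    // Hypothetical helper: true if the bytes are a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Sample metadata containing one non-ASCII character (assumption for illustration).
        String xml = "<meta vendor=\"caf\u00e9\"/>";

        // ISO-8859-1 stands in for a non-UTF-8 platform default charset.
        byte[] legacyBytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8Bytes = xml.getBytes(StandardCharsets.UTF_8);

        System.out.println("legacy bytes valid UTF-8: " + isValidUtf8(legacyBytes));
        System.out.println("UTF-8 bytes valid UTF-8:  " + isValidUtf8(utf8Bytes));
    }
}
```

Pure-ASCII metadata would encode identically in both charsets, so a failure of 
this kind would only show up when the metadata contains non-ASCII characters.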

Review the code and provide a fix. Perhaps the XML InputSource has a way to set 
the encoding. The default should be UTF-8. It seems we also need a new uima-as 
property (or command line arg) to override the default in case a user needs a 
different encoding.
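
A minimal sketch of the idea, not the actual fix: org.xml.sax.InputSource does 
offer setEncoding, so the bytes handed to the parser can be produced and 
declared with the same charset, defaulting to UTF-8. The class name, the parse 
helper, and the property name example.metadata.encoding are hypothetical and 
are not existing UIMA-AS settings.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class MetaEncodingSketch {

    // Hypothetical override property; not an existing UIMA-AS property.
    static final String ENCODING_PROPERTY = "example.metadata.encoding";

    // UTF-8 unless the user overrides it with -Dexample.metadata.encoding=...
    static Charset metadataCharset() {
        String name = System.getProperty(ENCODING_PROPERTY);
        return name == null ? StandardCharsets.UTF_8 : Charset.forName(name);
    }

    // Encode the metadata with an explicit charset and tell the parser which one was used,
    // so the platform default charset never enters the picture.
    static Document parse(String xml) throws Exception {
        Charset charset = metadataCharset();
        byte[] bytes = xml.getBytes(charset);
        InputSource source = new InputSource(new ByteArrayInputStream(bytes));
        source.setEncoding(charset.name());
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
    }

    public static void main(String[] args) throws Exception {
        // Sample metadata with a non-ASCII character (assumption for illustration).
        Document doc = parse("<meta vendor=\"caf\u00e9\"/>");
        String vendor = doc.getDocumentElement().getAttribute("vendor");
        System.out.println("round trip ok: " + "caf\u00e9".equals(vendor));
    }
}
```

An alternative worth weighing is to give the InputSource a character stream 
(for example a StringReader) when the metadata is already a String, which 
avoids the byte conversion altogether.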

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
