[ https://issues.apache.org/jira/browse/UIMA-6162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Eckart de Castilho resolved UIMA-6162.
----------------------------------------------
Resolution: Fixed
> Concurrent binary serialization produces corrupt output
> -------------------------------------------------------
>
> Key: UIMA-6162
> URL: https://issues.apache.org/jira/browse/UIMA-6162
> Project: UIMA
> Issue Type: Bug
> Components: UIMA
> Affects Versions: 3.1.1SDK
> Reporter: Richard Eckart de Castilho
> Priority: Major
> Attachments: admin.ser
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> I suspect there could be an issue in `BinaryCasSerDes`.
> When deserializing the attached file `admin.ser`, I get this stack trace:
> {code:java}
> Caused by: java.lang.ClassCastException: class org.apache.uima.jcas.tcas.Annotation cannot be cast to class org.apache.uima.jcas.cas.Sofa (org.apache.uima.jcas.tcas.Annotation and org.apache.uima.jcas.cas.Sofa are in unnamed module of loader org.apache.catalina.loader.ParallelWebappClassLoader @4593ff34)
>     at org.apache.uima.cas.impl.BinaryCasSerDes.makeSofaFromHeap(BinaryCasSerDes.java:1823) ~[uimaj-core-3.1.1.jar:3.1.1]
>     at org.apache.uima.cas.impl.BinaryCasSerDes.getSofaFromAnnotBase(BinaryCasSerDes.java:1817) ~[uimaj-core-3.1.1.jar:3.1.1]
>     at org.apache.uima.cas.impl.BinaryCasSerDes.createFSsFromHeaps(BinaryCasSerDes.java:1701) ~[uimaj-core-3.1.1.jar:3.1.1]
>     at org.apache.uima.cas.impl.BinaryCasSerDes.reinit(BinaryCasSerDes.java:259) ~[uimaj-core-3.1.1.jar:3.1.1]
>     at org.apache.uima.cas.impl.BinaryCasSerDes.reinit(BinaryCasSerDes.java:328) ~[uimaj-core-3.1.1.jar:3.1.1]
>     at org.apache.uima.cas.impl.Serialization.deserializeCASComplete(Serialization.java:129) ~[uimaj-core-3.1.1.jar:3.1.1]{code}
> The code used to read the file before deserializing is as follows:
> {code:java}
> public static void readSerializedCas(CAS aCas, File aFile)
>     throws IOException
> {
>     try (ObjectInputStream is = new ObjectInputStream(new FileInputStream(aFile))) {
>         CASCompleteSerializer serializer = (CASCompleteSerializer) is.readObject();
>         deserializeCASComplete(serializer, (CASImpl) aCas);
>     }
>     catch (ClassNotFoundException e) {
>         throw new IOException(e);
>     }
> }
> {code}
> I set a breakpoint at BinaryCasSerDes:1608, which is a for loop iterating over
> the heap. Apparently, the first feature structure encountered is an
> annotation, NOT the SOFA. Then in line 1700, the deserializer
> tries to resolve the SOFA for this annotation but fails because the SOFA has
> not yet been deserialized. Eventually makeSofaFromHeap is called and checks
> whether a SOFA needs to be created. It looks up the SOFA's ID (1) via
> csds.addr2fs.get(sofaAddr) (BinaryCasSerDes:1821) and generates a new SOFA.
> However, when the SECOND annotation is read, csds.addr2fs.get(sofaAddr)
> (BinaryCasSerDes:1821) is called again to resolve the SOFA at
> addr 1, and it returns the previously deserialized annotation instead of the
> SOFA that had been created.
> The SOFA that was implicitly created is added to the csds.addr2fs map at
> key 1. However, later in BinaryCasSerDes:1723, key 1 is overwritten by
> the deserialized annotation:
> {code}
> if (!isSofa) { // if it was a sofa, other code added or pended it
>     // this overwrites the SOFA that was created at key 1,
>     // because heapIndex is also 1
>     csds.addFS(fs, heapIndex);
> }
> {code}
> The heap looks something like this:
> {code}
> [0, 187, 1, 33, 46, 199, 200, 201, 44, 202, 187, 1, 33, 46, 203, 204, 205,
> 45, 206, 187, 1, 33, 46, 207, 208, 209, 46, 210, 187, 1, 33, 46, 211, 212,
> 213, 47, 214, 187, 1, 33, 46, 215, 216, 217, 48, 1, 187, 1,...
> {code}
> I guess that 187 is the type code of the first annotation, and we can see
> that it repeats several times. The 1 seems to be the SOFA ID, the first
> feature of each feature structure. However, instead of referring to the
> address of the SOFA, the 1 points at the first annotation, which is NOT a SOFA.
> Is this a bug in the serialization code, which assumes that the SOFA is
> always in the first position?
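
To make the sequence described above easier to follow, here is a minimal, self-contained sketch that replays the heap excerpt from the report against a plain map. It is not the uimaj-core code: the class `SofaOverwriteSketch`, the map standing in for `csds.addr2fs`, the string labels, and the record size of 9 slots (read off the excerpt) are all assumptions made for illustration, and 187 is only the type code guessed in the report.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SofaOverwriteSketch {
    // Hypothetical stand-in for csds.addr2fs: heap address -> label of the object stored there.
    static final Map<Integer, String> addr2fs = new HashMap<>();

    // Heap excerpt copied from the report (the report truncates it at this point as well).
    static final int[] HEAP = {
        0, 187, 1, 33, 46, 199, 200, 201, 44, 202, 187, 1, 33, 46, 203, 204, 205,
        45, 206, 187, 1, 33, 46, 207, 208, 209, 46, 210, 187, 1, 33, 46, 211, 212,
        213, 47, 214, 187, 1, 33, 46, 215, 216, 217, 48, 1, 187, 1
    };

    static final int ANNOTATION_TYPE_CODE = 187; // guessed in the report
    static final int RECORD_SIZE = 9;            // assumption: distance between repeats of 187

    // Mimics the lookup described for makeSofaFromHeap: reuse whatever is
    // registered at sofaAddr, or create and register a SOFA if nothing is there.
    static String resolveSofa(int sofaAddr) {
        String existing = addr2fs.get(sofaAddr);
        if (existing == null) {
            existing = "Sofa";
            addr2fs.put(sofaAddr, existing);
        }
        return existing;
    }

    // Walks the annotation records and returns what each one resolved as its SOFA.
    static List<String> replay() {
        addr2fs.clear();
        List<String> resolved = new ArrayList<>();
        for (int addr = 1;
             addr + 1 < HEAP.length && HEAP[addr] == ANNOTATION_TYPE_CODE;
             addr += RECORD_SIZE) {
            int sofaAddr = HEAP[addr + 1];           // first feature: the SOFA reference
            resolved.add(resolveSofa(sofaAddr));
            addr2fs.put(addr, "Annotation@" + addr); // non-SOFA FS registered at its own address
        }
        return resolved;
    }

    public static void main(String[] args) {
        System.out.println(replay());
    }
}
```

In this sketch the first record resolves to the newly created SOFA, but registering the annotation at address 1 then replaces that entry, so every later record resolves to the annotation. That mirrors the `ClassCastException` in the stack trace, under the assumptions stated above.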