Bingo! I created the posted type system by reverse engineering it from the XMI provided to me by a third party. I have since requested the type system XML from them, and I can now see that the original DocumentMetadata inherits directly from uima.cas.TOP! So, yes: the annotation did *not* inherit from AnnotationBase when it was serialized to XMI, but I asked uimaj to create an annotation that *does* inherit from AnnotationBase when deserializing it back into a Java object.
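For anyone who lands on the same message, here is a minimal sketch of a pre-check that can be run on an XMI file before handing it to the deserializer. It uses only the JDK's DOM parser, not UIMA. The class name SofaAttributeCheck and the method findIndexedWithoutSofa are made up for illustration. It is only a heuristic: an element whose type really does inherit directly from uima.cas.TOP has no sofa feature and may be listed in a view legitimately, so a hit means "compare this type against the type system you deserialize with", not "the file is broken".

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of UIMA.
public class SofaAttributeCheck {

    /**
     * Returns, in document order, the xmi:id of every top-level element that
     * some cas:View lists in its "members" attribute but that carries no
     * "sofa" attribute of its own.
     */
    public static List<String> findIndexedWithoutSofa(String xmi) throws Exception {
        // The default factory is not namespace aware, so element and attribute
        // names are matched exactly as written in the file ("cas:View", "xmi:id").
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmi.getBytes(StandardCharsets.UTF_8)));
        NodeList children = doc.getDocumentElement().getChildNodes();

        // Pass 1: collect every id that a view wants added to its indexes.
        Set<String> indexedIds = new HashSet<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE && "cas:View".equals(node.getNodeName())) {
                for (String id : ((Element) node).getAttribute("members").trim().split("\\s+")) {
                    if (!id.isEmpty()) {
                        indexedIds.add(id);
                    }
                }
            }
        }

        // Pass 2: report indexed elements that have no sofa attribute.
        List<String> suspects = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element element = (Element) node;
            if (indexedIds.contains(element.getAttribute("xmi:id")) && !element.hasAttribute("sofa")) {
                suspects.add(element.getAttribute("xmi:id"));
            }
        }
        return suspects;
    }

    public static void main(String[] args) throws Exception {
        // Same content as CAS2.xmi from the original report (single-quoted attributes for brevity).
        String cas2 = "<xmi:XMI xmlns:cas='http:///uima/cas.ecore' xmlns:xmi='http://www.omg.org/XMI'"
                + " xmlns:ls='http:///com/example.ecore' xmi:version='2.0'>"
                + "<cas:NULL xmi:id='0'/>"
                + "<ls:DocumentMetadata xmi:id='18' source='file001.txt' documentId='001'/>"
                + "<cas:Sofa xmi:id='1' sofaNum='1' sofaID='_InitialView' mimeType='text'"
                + " sofaString='This is a test.'/>"
                + "<cas:View sofa='1' members='18'/>"
                + "</xmi:XMI>";
        System.out.println("Indexed without sofa: " + findIndexedWithoutSofa(cas2));
    }
}
```

Running it against the two files from my original report should flag xmi:id 18 in CAS2.xmi and nothing in CAS1.xmi.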
Thank you for your time and for the detailed answer. I now see more clearly the complexity of the issue. Great work!

Cheers,
Pablo

On Fri, May 13, 2016 at 6:33 AM, Marshall Schor <[email protected]> wrote:

> Thanks for your report, analysis, and the data; very thorough and helpful.
>
> I was trying to imagine how the input XMI file could have become "corrupted"
> in this manner. I'm guessing that what may have happened is that something
> outside of the XMI file changed - for instance, the type system.
>
> This situation (of the sofa attribute being missing) could happen if the type
> definition for DocumentMetadata changed.
>
> I searched the source code for UIMA for "DocumentMetadata" as a UIMA type, and
> came up empty, so I'm guessing this is some type that was defined in your
> particular application. I see in your note that you included a description of
> the type system defining DocumentMetadata, and it shows that the supertype
> hierarchy of this type is:
>
>   uima.tcas.DocumentAnnotation, whose supertype is
>   uima.tcas.Annotation, whose supertype is
>   uima.cas.AnnotationBase, whose supertype is
>   uima.cas.TOP
>
> I'm guessing that at the time the XMI serialized form was created, a different
> type system was being used that defined the supertype hierarchy for
> DocumentMetadata such that it did **not** have uima.cas.AnnotationBase in its
> supertype hierarchy. This would mean that it did not have the "sofa" feature.
>
> Can you perhaps confirm that this (changing the type system in this manner)
> was probably the cause of this?
>
> -----------------
>
> In figuring out what's the best thing to do, it seems there are multiple
> conditions to maybe try and catch.
> To cause this failure:
>
> 1) a type was deserialized where the type system had some features that were
>    missing in the serialization
>
> 2) the particular feature "sofa" was missing
>
> 3) the feature structure with the missing sofa feature was in a list in the
>    serialization specifying it was to be added to the indexes
>
> (1) is not generally an error; it is allowed to permit evolution of type
> systems (in a compatible way) over time. For example, new features could be
> added. Any features not in the serialization are set to their default values.
>
> (2) even this is not (necessarily) an error (but it is bad practice). For
> example, you might be using the initial view, and might have never created a
> Sofa. (Sofas are always created if you create a view programmatically).
>
> Later versions of UIMA check for the sofa feature missing when attempting to
> add a Feature Structure to the indexes; this test was not always there, and,
> for backwards compatibility, it can be disabled with a
> -Duima.disable_enhanced_check_wrong_add_to_index JVM property.
>
> ----------------
>
> Because of this, I'm planning to leave the detection of this alone, but will
> change the error message to indicate some potential causes of this situation,
> including that the type system definition changed for this type from one not
> having uima.cas.AnnotationBase in the hierarchy when the serialized form was
> created, to the current type system being used for deserialization, which does
> (especially, if you can confirm this was the likely cause).
>
> Thank you very much for your report and analysis!
>
> -Marshall
>
>
>
> On 5/6/2016 6:30 PM, Pablo N. Mendes wrote:
> > Folks,
> > I am getting "No sofaFS for specified sofaRef found" while trying to
> > deserialize an XMI. I found the message a bit cryptic and didn't find much
> > help on the lazyweb, so I bit the bullet and spent a few hours poking
> > around. It seems to be a missing "sofa" attribute.
> > If the sofa attribute has the wrong value, then you get "xmi id <id> is
> > referenced but not defined" which is very nice and clear. But if you omit
> > the sofa attribute you get "No sofaFS for specified sofaRef found" which is
> > less informative IMHO.
> >
> > Extra info below.
> >
> > Cheers,
> > Pablo
> >
> > $ diff cas1.xmi cas2.xmi
> > 9c9
> > < <ls:DocumentMetadata xmi:id="18" sofa="1" source="file001.txt" documentId="001"/>
> > ---
> > > <ls:DocumentMetadata xmi:id="18" source="file001.txt" documentId="001"/>
> >
> >
> >
> > VERSIONS
> >
> > <uima.version>2.8.1</uima.version>
> > <uimafit.version>2.1.0</uimafit.version>
> >
> > JAVA CODE SNIPPET
> >
> > org.apache.uima.util.XmlCasDeserializer.deserialize(inputStream, jCas.getCas());
> >
> > STACK TRACE
> >
> > Exception in thread "main" org.apache.uima.cas.CASRuntimeException: No sofaFS for specified sofaRef found.
> >     at org.apache.uima.cas.impl.CASImpl.getSofa(CASImpl.java:806)
> >     at org.apache.uima.cas.impl.FSIndexRepositoryImpl.ll_addFS_common(FSIndexRepositoryImpl.java:2781)
> >     at org.apache.uima.cas.impl.FSIndexRepositoryImpl.ll_addFS(FSIndexRepositoryImpl.java:2763)
> >     at org.apache.uima.cas.impl.FSIndexRepositoryImpl.addFS(FSIndexRepositoryImpl.java:2068)
> >     at org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.endDocument(XmiCasDeserializer.java:1486)
> >     at org.apache.uima.util.XmlCasDeserializer$XmlCasDeserializerHandler.endDocument(XmlCasDeserializer.java:127)
> >     at org.apache.xerces.parsers.AbstractSAXParser.endDocument(Unknown Source)
> >     at org.apache.xerces.impl.XMLDocumentScannerImpl.endEntity(Unknown Source)
> >     at org.apache.xerces.impl.XMLEntityManager.endEntity(Unknown Source)
> >     at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
> >     at org.apache.xerces.impl.XMLEntityScanner.skipSpaces(Unknown Source)
> >     at org.apache.xerces.impl.XMLDocumentScannerImpl$TrailingMiscDispatcher.dispatch(Unknown Source)
> >     at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
> >     at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> >     at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> >     at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
> >     at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
> >     at org.apache.uima.util.XmlCasDeserializer.deserialize(XmlCasDeserializer.java:83)
> >     at org.apache.uima.util.XmlCasDeserializer.deserialize(XmlCasDeserializer.java:58)
> >     ...
> >
> >
> > CAS1.xmi
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> > <xmi:XMI
> >     xmlns:cas="http:///uima/cas.ecore"
> >     xmlns:tcas="http:///uima/tcas.ecore"
> >     xmlns:xmi="http://www.omg.org/XMI"
> >     xmlns:ls="http:///com/example.ecore"
> >     xmi:version="2.0">
> >   <cas:NULL xmi:id="0"/>
> >   <ls:DocumentMetadata xmi:id="18" sofa="1" source="file001.txt" documentId="001"/>
> >   <cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" mimeType="text" sofaString="This is a test."/>
> >   <cas:View sofa="1" members="18"/>
> > </xmi:XMI>
> >
> >
> > CAS2.xmi
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> > <xmi:XMI
> >     xmlns:cas="http:///uima/cas.ecore"
> >     xmlns:tcas="http:///uima/tcas.ecore"
> >     xmlns:xmi="http://www.omg.org/XMI"
> >     xmlns:ls="http:///com/example.ecore"
> >     xmi:version="2.0">
> >   <cas:NULL xmi:id="0"/>
> >   <ls:DocumentMetadata xmi:id="18" source="file001.txt" documentId="001"/>
> >   <cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" mimeType="text" sofaString="This is a test."/>
> >   <cas:View sofa="1" members="18"/>
> > </xmi:XMI>
> >
> >
> > TYPESYSTEM
> >
> > <?xml version="1.0" encoding="UTF-8" ?>
> >
> > <typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
> >   <name>ExampleTypeSystem</name>
> >   <description>Just an example</description>
> >   <vendor>example.com</vendor>
> >   <version>1.0</version>
> >   <types>
> >     <typeDescription>
> >       <name>com.example.DocumentMetadata</name>
> >       <description></description>
> >       <supertypeName>uima.tcas.DocumentAnnotation</supertypeName>
> >       <features>
> >         <featureDescription>
> >           <name>source</name>
> >           <description>Source</description>
> >           <rangeTypeName>uima.cas.String</rangeTypeName>
> >         </featureDescription>
> >         <featureDescription>
> >           <name>documentId</name>
> >           <description>Source</description>
> >           <rangeTypeName>uima.cas.String</rangeTypeName>
> >         </featureDescription>
> >       </features>
> >     </typeDescription>
> >   </types>
> > </typeSystemDescription>
> >
> >

--
Pablo N. Mendes
http://pablomendes.com
