Yes, this seems like a bug. The range of the "attributes" feature should be an FSArray, each element of which should be an instance of AttributeFS.
Can you create a Jira bug report for this, and maybe a patch?

-Marshall (trying to get others to contribute :-) )

On 9/16/2011 1:00 PM, Greg Holmberg wrote:
>
> In TikaAnnotator/desc/MarkupAnnotatorTypeSystem.xml, the MarkupAnnotation
> type is defined to have a feature named "attributes" of range-type FSArray
> and element-type FSArray.
>
> In a small sample of XMI output, I see MarkupAnnotations with "attributes"
> values referencing objects of type AttributeFS, not FSArray. For example:
>
> <tika:MarkupAnnotation xmi:id="97" sofa="61" begin="33" end="52"
>     attributes="110" name="a" qualifiedName="a"
>     uri="http://www.w3.org/1999/xhtml" />
>
> <tika:AttributeFS xmi:id="110" localName="href" qualifiedName="href" uri=""
>     value="/Title?0091209" />
>
>
> Shouldn't the element-type of the "attributes" feature of the
> MarkupAnnotation type be AttributeFS, not FSArray?
>
> <typeDescription>
>   <name>org.apache.uima.tika.MarkupAnnotation</name>
>   <description/>
>   <supertypeName>uima.tcas.Annotation</supertypeName>
>   <features>
>     <featureDescription>
>       <name>attributes</name>
>       <description/>
>       <rangeTypeName>uima.cas.FSArray</rangeTypeName>
>       <elementType>org.apache.uima.tika.AttributeFS</elementType>
>     </featureDescription>
>
>
> Thanks,
>
> Greg
>
