Yes, this seems like a bug.

The range of the "attributes" feature should be an FSArray, each element of
which should be an instance of AttributeFS.
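
For what it's worth, here is a rough sketch of how the sample XMI could be
checked to see what the "attributes" references actually point at. It uses
only the Python standard library, not UIMA. Two things in it are assumptions
on my part and not taken from the descriptor: the tika namespace URI, and the
reading of the "attributes" value as a space-separated list of xmi:id
references (one per array element).

```python
import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/XMI"
# Assumption: namespace URI for the tika types; adjust to match the real XMI.
TIKA_NS = "http:///org/apache/uima/tika.ecore"

# The two elements from the sample, wrapped in a root element so they parse.
SAMPLE = f"""<xmi:XMI xmlns:xmi="{XMI_NS}" xmlns:tika="{TIKA_NS}">
  <tika:MarkupAnnotation xmi:id="97" sofa="61" begin="33" end="52"
      attributes="110" name="a" qualifiedName="a"
      uri="http://www.w3.org/1999/xhtml"/>
  <tika:AttributeFS xmi:id="110" localName="href" qualifiedName="href" uri=""
      value="/Title?0091209"/>
</xmi:XMI>"""


def element_types(xmi_text, feature="attributes"):
    """Map each owner's xmi:id to the type names its feature refers to."""
    root = ET.fromstring(xmi_text)
    id_key = f"{{{XMI_NS}}}id"
    # Index every top-level feature structure by its xmi:id.
    by_id = {el.get(id_key): el for el in root}
    result = {}
    for el in root:
        refs = el.get(feature)
        if refs is None:
            continue
        # Assumption: the value is a whitespace-separated list of ids.
        result[el.get(id_key)] = [
            by_id[ref].tag.split("}")[-1] for ref in refs.split()
        ]
    return result


print(element_types(SAMPLE))
```

If every name reported for a MarkupAnnotation is AttributeFS, that agrees
with an element type of AttributeFS for the feature.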

Can you create a Jira bug report for this, and maybe a patch?

-Marshall  (trying to get others to contribute :-) )

On 9/16/2011 1:00 PM, Greg Holmberg wrote:
>
> In TikaAnnotator/desc/MarkupAnnotatorTypeSystem.xml, the MarkupAnnotation 
> type is defined to have a feature named "attributes" of range-type FSArray 
> and element-type FSArray. 
>
> In a small sample of XMI output, I see MarkupAnnotations with "attributes" 
> values referencing objects of type AttributeFS, not FSArray.  For example: 
>
>   <tika:MarkupAnnotation xmi:id="97" sofa="61" begin="33" end="52" 
> attributes="110" name="a" qualifiedName="a" 
> uri="http://www.w3.org/1999/xhtml" /> 
>
>   <tika:AttributeFS xmi:id="110" localName="href" qualifiedName="href" uri="" 
> value="/Title?0091209" /> 
>
>   
> Shouldn't the element-type of the "attributes" feature of the 
> MarkupAnnotation type be AttributeFS, not FSArray? 
>
>     <typeDescription>
>       <name>org.apache.uima.tika.MarkupAnnotation</name>
>       <description/>
>       <supertypeName>uima.tcas.Annotation</supertypeName>
>       <features>
>         <featureDescription>
>           <name>attributes</name>
>           <description/>
>           <rangeTypeName>uima.cas.FSArray</rangeTypeName>
>           <elementType>org.apache.uima.tika.AttributeFS</elementType>
>         </featureDescription>
>
>
> Thanks,
>
> Greg
>
