Re: Ruta - Token Order

Peter Klügl Tue, 21 May 2013 08:32:34 -0700

On 21.05.2013 16:15, Thilo Goetz wrote:
> On 05/21/2013 01:37 PM, Peter Klügl wrote:
>> Hi,
>>
>> On 21.05.2013 12:47, [email protected] wrote:
>>> Hi,
>>>
>>> In Ruta 2.0.2-SNAPSHOT a token with begin offset 0 and end offset 2
>>> comes before a token with begin offset 0 and end offset 0. The token
>>> order is not as I expected. Thus in my case,
>>> SourceDocumentAnnotation was the second token in the token sequence
>>> and the rule didn't match. It took me some time to find that out.
>>> The end offset of SourceDocumentAnnotation should rather be the
>>> length of the text. How is the token ordering defined?
>>
>> Annotations of length 0 can be problematic in UIMA Ruta due to the
>> inference mechanism and should be avoided. The reason is that the
>> RutaBasic annotations form a complete, disjoint partition of the
>> document. If an annotation has length 0, the match can be ambiguous.
>>
>> The token order should be almost identical to the normal UIMA order and
>> there should only be a difference for specific types.
>
> That is the normal order.  Longer annotations that start at the same
> position come before shorter ones.  Whether you agree with this
> decision is another matter ;-)


You are of course right :-)
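
For illustration, here is a minimal sketch of that ordering in plain
Python. It does not use the UIMA API; the Span class, the sort_key
function and the sample offsets are hypothetical and only model the rule
described above: ascending begin, then descending end (longer first),
with the type priorities quoted below as an assumed tie-breaker.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Hypothetical stand-in for an annotation: a type name plus offsets."""
    type_name: str
    begin: int
    end: int


# Type priorities as listed in this thread (earlier entry = higher priority).
TYPE_PRIORITY = ["RutaFrame", "Annotation", "RutaBasic"]


def sort_key(span: Span):
    """Order by begin ascending, then end descending, then type priority."""
    if span.type_name in TYPE_PRIORITY:
        priority = TYPE_PRIORITY.index(span.type_name)
    else:
        # Assumption: types not in the list sort after the listed ones.
        priority = len(TYPE_PRIORITY)
    return (span.begin, -span.end, priority)


spans = [
    Span("Annotation", 0, 0),  # zero-length annotation at the document start
    Span("Annotation", 0, 2),  # first token, covering offsets 0..2
    Span("Annotation", 2, 5),
]

ordered = sorted(spans, key=sort_key)
for span in ordered:
    print(span)
```

With these sample offsets, the annotation covering 0..2 sorts before the
zero-length annotation at 0, which is the situation Armin described.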

Peter



>
>> The type priorities are:
>> RutaFrame
>> Annotation
>> RutaBasic
>>
>> I will take a look at the situation you described. It sounds like a bug
>> for annotations of length 0 and should not occur at all.
>>
>> May I ask with which rule you tried to match a token and the SDA?
>>
>> Best,
>>
>> Peter
>>
>>> Cheers,
>>> Armin
>>
