Re: Ruta - Token Order

Peter Klügl Tue, 21 May 2013 04:38:07 -0700

Hi,

On 21.05.2013 12:47, [email protected] wrote:
> Hi,
>
> In Ruta 2.0.2-SNAPSHOT a token with begin offset 0 and end offset 2 comes 
> before a token with begin offset 0 and end offset 0. The token order is not 
> as I expected. Thus in my case, SourceDocumentAnnotation was the second token 
> in the token sequence and the rule didn't match. It took me some time to find 
> that out. The end offset of SourceDocumentAnnotation should better be the 
> length of the text. How is the token ordering defined?


Zero-length annotations can be problematic in UIMA Ruta because of the
inference mechanism, and they should be avoided. The reason is that the
RutaBasic annotations form a complete, disjoint partition of the
document. An annotation of length 0 does not cover any of these
segments, so the match can be ambiguous.
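
To illustrate the ambiguity, here is a toy model in Python. It is not
Ruta's actual implementation: the spans, the sample offsets and the
helper name anchors are made up for this sketch. It only assumes that
every annotation is attached to the segment where it begins and to the
segment where it ends.

```python
# Toy model: a five-character document split into disjoint, gap-free
# segments, standing in for RutaBasic annotations (hypothetical offsets).
basics = [(0, 2), (2, 3), (3, 5)]


def anchors(begin, end):
    """Return (segments starting at `begin`, segments ending at `end`)."""
    starts = [b for b in basics if b[0] == begin]
    ends = [b for b in basics if b[1] == end]
    return starts, ends


# An ordinary annotation begins and ends on the same segment.
token = anchors(0, 2)

# A zero-length annotation on a boundary begins on one segment but
# ends on the previous one, so it is unclear where it belongs.
zero_length = anchors(2, 2)

print(token)
print(zero_length)
```

In this model the zero-length annotation at offset 2 is anchored to two
different segments, and its end segment lies before its begin segment.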

The token order should be almost identical to the normal UIMA order; it
should differ only for specific types.
The type priorities are:
RutaFrame
Annotation
RutaBasic
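
As a rough sketch of why an annotation spanning 0 to 2 can come before
one spanning 0 to 0: this assumes the default UIMA annotation index
order (begin ascending, then end descending, then type priority) and
uses the priority list above. The tuples and the name index_key are
only illustrative, and "Annotation" stands in for the zero-length
annotation's type.

```python
# Type priorities as listed above; earlier entries sort first.
TYPE_PRIORITY = ["RutaFrame", "Annotation", "RutaBasic"]


def index_key(annotation):
    """Sort key for the assumed order: begin ascending, end descending,
    then type priority. Raises ValueError for unlisted type names."""
    type_name, begin, end = annotation
    return (begin, -end, TYPE_PRIORITY.index(type_name))


annotations = [
    ("Annotation", 0, 0),  # zero-length annotation at the document start
    ("RutaBasic", 0, 2),   # first token, covering offsets 0 to 2
    ("RutaBasic", 3, 5),   # a later token
]

ordered = sorted(annotations, key=index_key)
for entry in ordered:
    print(entry)
```

Under these assumptions the longer annotation sorts first whenever the
begin offsets are equal, so the zero-length annotation ends up second
even though its type has the higher priority.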

I will take a look at the situation you described. It sounds like a bug
in the handling of zero-length annotations, and it should not occur at
all.

May I ask which rule you used to match a token and the SDA
(SourceDocumentAnnotation)?

Best,

Peter

> Cheers,
> Armin
