On 21.05.2013 16:15, Thilo Goetz wrote:
> On 05/21/2013 01:37 PM, Peter Klügl wrote:
>> Hi,
>>
>> On 21.05.2013 12:47, [email protected] wrote:
>>> Hi,
>>>
>>> In Ruta 2.0.2-SNAPSHOT a token with begin offset 0 and end offset 2
>>> comes before a token with begin offset 0 and end offset 0. The token
>>> order is not as I expected. Thus in my case,
>>> SourceDocumentAnnotation was the second token in the token sequence
>>> and the rule didn't match. It took me some time to find that out.
>>> The end offset of SourceDocumentAnnotation should better be the
>>> length of the text. How is the token ordering defined?
>>
>> Annotations of the length 0 can be problematic in UIMA Ruta due to the
>> inference mechanism and should be avoided. The reason for this is the
>> complete disjoint partition of the document represented by the RutaBasic
>> annotations. If they have the length 0, then the match can be ambiguous.
>>
>> The token order should be almost identical to the normal UIMA order and
>> there should only be a difference for specific types.
>
> That is the normal order. Longer annotations that start at the same
> position come before shorter ones. Whether you agree with this
> decision is another matter ;-)

You are of course right :-)

Peter

>
>> The type priorities are:
>> RutaFrame
>> Annotation
>> RutaBasic
>>
>> I will take a look at the situation you described. It sounds like a bug
>> for annotations of the length 0 and should not occur at all.
>>
>> May I ask with which rule you tried to match a token and the SDA?
>>
>> Best,
>>
>> Peter
>>
>>> Cheers,
>>> Armin
>>
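
For anyone reading this thread later, the ordering discussed above can be illustrated with a small standalone sketch. It does not use the UIMA or Ruta API. The `Span` class, the `sort_key` and `index_order` helpers, and the "Token" label are hypothetical names chosen for illustration. It assumes the comparison is begin offset ascending, then end offset descending (longer first), then type priority as a tie-breaker, and it treats types missing from the priority list as lowest priority, which is a simplification.

```python
from typing import List, NamedTuple

# Type priorities as listed in the thread; earlier entries sort first.
TYPE_PRIORITY = ["RutaFrame", "Annotation", "RutaBasic"]


class Span(NamedTuple):
    """Hypothetical stand-in for an annotation: a type name and offsets."""
    type_name: str
    begin: int
    end: int


def sort_key(span: Span):
    # Assumption: types not in the priority list sort after the listed ones.
    if span.type_name in TYPE_PRIORITY:
        priority = TYPE_PRIORITY.index(span.type_name)
    else:
        priority = len(TYPE_PRIORITY)
    # Begin ascending, end descending (longer spans first), then priority.
    return (span.begin, -span.end, priority)


def index_order(spans: List[Span]) -> List[Span]:
    """Return the spans in the order described in the thread."""
    return sorted(spans, key=sort_key)


spans = [
    Span("SourceDocumentAnnotation", 0, 0),  # zero-length, as in the report
    Span("Token", 0, 2),
    Span("Token", 3, 7),
]
ordered = index_order(spans)
for span in ordered:
    print(span.type_name, span.begin, span.end)
```

With these inputs the span covering offsets 0 to 2 sorts ahead of the zero-length span at offset 0, which matches the behaviour Armin observed. If the SourceDocumentAnnotation covered the whole text instead, it would be the longest span starting at 0 and would come first.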
