Hi,

On 21.05.2013 12:47, [email protected] wrote:
> Hi,
>
> In Ruta 2.0.2-SNAPSHOT a token with begin offset 0 and end offset 2 comes
> before a token with begin offset 0 and end offset 0. The token order is not
> as I expected. Thus in my case, SourceDocumentAnnotation was the second token
> in the token sequence and the rule didn't match. It took me some time to find
> that out. The end offset of SourceDocumentAnnotation should better be the
> length of the text. How is the token ordering defined?

Annotations of length 0 can be problematic in UIMA Ruta because of the inference mechanism and should be avoided. The reason is that the RutaBasic annotations represent a complete, disjoint partition of the document. If an annotation has length 0, the match can be ambiguous.

The token order should be almost identical to the normal UIMA order; there should only be a difference for specific types. The type priorities are:

RutaFrame
Annotation
RutaBasic

I will take a look at the situation you described. It sounds like a bug for annotations of length 0 and should not occur at all. May I ask which rule you used to match a token and the SDA?

Best,

Peter

> Cheers,
> Armin
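
For illustration, the ordering discussed above can be sketched in a few lines of Python. This is not Ruta code, only a minimal model that assumes the default UIMA annotation index order: begin offset ascending, then end offset descending, then type priority. The tuple layout, the `uima_sort_key` helper, and the rule that unlisted types rank after the listed ones are assumptions made for the example.

```python
# Type priorities as listed in the message; earlier entries sort first.
TYPE_PRIORITY = ["RutaFrame", "Annotation", "RutaBasic"]


def uima_sort_key(annotation):
    """Sort key modelling the default UIMA annotation index order."""
    type_name, begin, end = annotation
    # Assumption: types not in the priority list rank after all listed types.
    if type_name in TYPE_PRIORITY:
        rank = TYPE_PRIORITY.index(type_name)
    else:
        rank = len(TYPE_PRIORITY)
    # Begin ascending, end descending (longer annotation first), then priority.
    return (begin, -end, rank)


annotations = [
    ("SourceDocumentAnnotation", 0, 0),  # zero-length, as in the report
    ("Token", 0, 2),
    ("Token", 3, 7),
]

ordered = sorted(annotations, key=uima_sort_key)
for type_name, begin, end in ordered:
    print(type_name, begin, end)
```

Under these assumptions the token spanning 0 to 2 sorts before the zero-length SourceDocumentAnnotation, which is the order described in the report. If the SDA ended at the length of the text, it would sort first instead.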
