Re: AW: Ruta - Token Order

Peter Klügl Tue, 21 May 2013 04:53:06 -0700

Hi,

On 21.05.2013 13:49, [email protected] wrote:
> Hi Peter,
>
> I don't think the rule matters, but I was trying to find calendar dates. To 
> find out what was going wrong, I reduced the original, more complex rule to
>  
> DECLARE Date;
> Document{->RETAINTYPE(BREAK, SPACE)};
> NUM{REGEXP("\\d\\d")->MARK(Date, 1, 2)} PERIOD;
>
> on the input text
>
> 12. Mai 1803
>
> I didn't use SourceDocumentInformation in the rule. It was just in the way. 
> The first token with RutaBasic was NUM(0, 2), that is "12", and the second 
> token was SourceDocumentInformation(0, 0). So the rule failed which is 
> correct.


Ah, thanks. I will try to reproduce it and fix it ASAP.

Best,

Peter


> Cheers,
> Armin
>
> -----Original Message-----
> From: Peter Klügl [mailto:[email protected]] 
> Sent: Tuesday, 21 May 2013 13:38
> To: [email protected]
> Subject: Re: Ruta - Token Order
>
> Hi,
>
> On 21.05.2013 12:47, [email protected] wrote:
>> Hi,
>>
>> In Ruta 2.0.2-SNAPSHOT a token with begin offset 0 and end offset 2 comes 
>> before a token with begin offset 0 and end offset 0. The token order is not 
>> as I expected. Thus in my case, SourceDocumentAnnotation was the second 
>> token in the token sequence and the rule didn't match. It took me some time 
>> to find that out. The end offset of SourceDocumentAnnotation should preferably 
>> be the length of the text. How is the token ordering defined?
>
> Zero-length annotations can be problematic in UIMA Ruta due to the 
> inference mechanism and should be avoided. The reason for this is the 
> complete, disjoint partition of the document represented by the RutaBasic 
> annotations. If an annotation has length 0, the match can be ambiguous.
>
> The token order should be almost identical to the normal UIMA order and there 
> should only be a difference for specific types.
> The type priorities are:
> RutaFrame
> Annotation
> RutaBasic
>
> I will take a look at the situation you described. It sounds like a bug for 
> annotations of the length 0 and should not occur at all.
>
> May I ask with which rule you tried to match a token and the SDA?
>
> Best,
>
> Peter
>
>> Cheers,
>> Armin
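
For reference, the order reported in this thread is what the standard UIMA annotation index ordering yields: begin offset ascending, then end offset descending, with type priorities only breaking ties between annotations that have identical offsets. The following is a minimal sketch of that comparison in plain Java, assuming Ruta follows the same order here. The Span class and all names in it are hypothetical stand-ins and not part of the UIMA or Ruta API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class AnnotationOrderSketch {

    // Hypothetical stand-in for an annotation: a type name plus offsets.
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    // Begin ascending, then end descending (the longer span comes first).
    // Type priorities are left out: they would only matter for spans
    // with identical begin and end offsets.
    static final Comparator<Span> BEGIN_ASC_END_DESC =
            Comparator.comparingInt((Span s) -> s.begin)
                    .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed());

    // Returns a sorted copy and leaves the input list untouched.
    static List<Span> sorted(List<Span> spans) {
        List<Span> copy = new ArrayList<>(spans);
        copy.sort(BEGIN_ASC_END_DESC);
        return copy;
    }

    public static void main(String[] args) {
        // Offsets taken from the example text "12. Mai 1803".
        List<Span> spans = Arrays.asList(
                new Span("SourceDocumentInformation", 0, 0),
                new Span("NUM", 0, 2),
                new Span("PERIOD", 2, 3));
        for (Span s : sorted(spans)) {
            System.out.println(s.type + "(" + s.begin + ", " + s.end + ")");
        }
    }
}
```

Under this ordering NUM(0, 2) sorts before the zero-length SourceDocumentInformation(0, 0), which then sits between NUM and PERIOD, so a rule expecting PERIOD directly after NUM does not match.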
