Hi all,
I have an example of rules that don't quite work, which made me realize
that I don't understand how text is traversed in Ruta and how rules are
applied.
Below is a simplified example of what I'm doing.
Say I have a text that contains "words" like these:
1 aa+bb
2 aa / aa+bb
3 aa / aa / aa+bb
I want to annotate the tokens as follows
1 FOUND
2 FOUND / FOUND
3 FOUND / FOUND / FOUND
There can also be longer sequences separated by slashes.
These are my rules:
"aa" "+" "bb" {->MARK(FOUND,1,3)};
"aa" "/" FOUND {->MARK(FOUND, 1)};
In other words: the rightmost token of the sequence is annotated first as
FOUND, and this becomes evidence for annotating the preceding tokens as
FOUND as well.
The problem is that only cases 1 and 2 are fully annotated. Case 3 is
annotated only partially:
1 FOUND
2 FOUND / FOUND
3 aa / FOUND / FOUND
It seems that the second rule is applied only once, though I expected it
to be applied repeatedly, as long as there is a match. Case 3 should work
as soon as case 2 has been annotated, because case 3 is an extension of
case 2.
Case 3 starts to work when the second rule is duplicated, which is not a
good solution, in my opinion. My question is: is the above by design (rule
matching does not restart after a match), or is it a bug in Ruta? Or maybe
there is a configuration option to choose the behaviour?
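To make my mental model concrete, here is a small Python sketch of what I
suspect is happening. It is only my assumption about the matching
behaviour (each rule makes one left-to-right pass over the text), not
Ruta's actual implementation. All function names and the rule2_passes
parameter are made up for this illustration.

```python
import re


def apply_rule1(tokens, found):
    """Mimics: "aa" "+" "bb" {->MARK(FOUND,1,3)}; marks the whole span."""
    for i in range(len(tokens) - 2):
        if tokens[i:i + 3] == ["aa", "+", "bb"]:
            found.add((i, i + 3))


def apply_rule2(tokens, found):
    """Mimics: "aa" "/" FOUND {->MARK(FOUND, 1)}; as ONE left-to-right pass.

    ASSUMPTION: a position that failed to match is not revisited in this pass.
    Returns True if a new FOUND span was added.
    """
    changed = False
    for i in range(len(tokens) - 2):
        found_starts_next = any(begin == i + 2 for begin, _ in found)
        if (tokens[i] == "aa" and tokens[i + 1] == "/"
                and found_starts_next and (i, i + 1) not in found):
            found.add((i, i + 1))
            changed = True
    return changed


def render(tokens, found):
    """Print the tokens, replacing every FOUND span with the word FOUND."""
    span_end = {begin: end for begin, end in found}
    out, i = [], 0
    while i < len(tokens):
        if i in span_end:
            out.append("FOUND")
            i = span_end[i]
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


def annotate(text, rule2_passes=1):
    """Apply rule 1 once, then rule 2 rule2_passes times (None = until stable)."""
    tokens = re.findall(r"[a-z]+|[+/]", text)
    found = set()
    apply_rule1(tokens, found)
    if rule2_passes is None:
        while apply_rule2(tokens, found):
            pass
    else:
        for _ in range(rule2_passes):
            apply_rule2(tokens, found)
    return render(tokens, found)


print(annotate("aa / aa / aa+bb"))                     # what I observe
print(annotate("aa / aa / aa+bb", rule2_passes=2))     # second rule duplicated
print(annotate("aa / aa / aa+bb", rule2_passes=None))  # what I expected
```

With one pass the leftmost "aa" is tested before the middle one has been
marked, so it is skipped. Two passes correspond to duplicating the second
rule, and repeating until nothing changes is the behaviour I expected.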
Thank you in advance and best regards,
Nikolai