Hi,
I am trying to write RUTA rules using regular expressions and capturing
groups, and I want the rules to match line by line. I can do this using
the following script:
ENGINE utils.PlainTextAnnotator;
TYPESYSTEM utils.PlainTextTypeSystem;
Document{-> RETAINTYPE(BREAK)};
Document{-> EXEC(PlainTextAnnotator)};
DECLARE Rule1NoPattern, Group1, Group2;
Line{REGEXP(".*no|No (.*)") -> Rule1NoPattern};
Given this text:
Not pregnant or nursing
Fertile patients must use effective contraception (hormonal contraception
or intra-uterine device [IUD])
No concurrent participation in another clinical trial that would preclude
the interventions or outcome assessment of this clinical trial
No other concurrent anticancer therapy
it correctly matches the last two lines (the ones that begin with "No")
and annotates them with Rule1NoPattern.
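To make the matching behaviour described above concrete, here is a minimal
sketch in Python (not RUTA) of how the pattern is read. Alternation binds
more loosely than concatenation, so ".*no|No (.*)" means ".*no" OR
"No (.*)". The sketch assumes that the REGEXP condition has to match the
whole covered line and that the wrapped entries of the sample are single
lines in the real input. Both are assumptions, and Python's re module only
stands in for the regex engine RUTA uses.

```python
import re

# Same pattern as in the REGEXP condition. Alternation has the lowest
# precedence, so this reads as: (".*no") OR ("No (.*)").
PATTERN = re.compile(r".*no|No (.*)")

# Sample text, with the wrapped entries joined back into single lines
# (an assumption about the original input).
LINES = [
    "Not pregnant or nursing",
    ("Fertile patients must use effective contraception "
     "(hormonal contraception or intra-uterine device [IUD])"),
    ("No concurrent participation in another clinical trial that would "
     "preclude the interventions or outcome assessment of this clinical "
     "trial"),
    "No other concurrent anticancer therapy",
]


def matching_lines(lines):
    """Return the lines that the pattern matches in full."""
    return [line for line in lines if PATTERN.fullmatch(line)]


matched = matching_lines(LINES)
for line in matched:
    # Print only the start of each matched line to keep the output short.
    print(line[:40])
```

Under these assumptions only the two lines starting with "No " match, and
both do so through the second alternative, which is the one that contains
the capturing group.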
The problem is that I also want to use the capturing group information. I
can do this using the simple regular expression rule syntax:
".*no|No (.*)\n|S" -> Rule1NoPattern, 1=Group1;
If I give it just one line, say
No other concurrent anticancer therapy
it will correctly annotate the entire line with Rule1NoPattern, and "other
concurrent anticancer therapy" will be annotated with Group1.
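As an illustration of the capturing-group result described above, the
following Python sketch (standing in for RUTA, with Python's re module in
place of RUTA's regex engine) shows which text and offsets group 1 covers
for the single-line input. It uses the pattern from the REGEXP condition,
and the helper name group1_span is hypothetical.

```python
import re

# Pattern taken from the REGEXP condition in the script.
PATTERN = re.compile(r".*no|No (.*)")


def group1_span(line):
    """Return (text, begin, end) of capturing group 1, or None.

    None is returned when the line does not match in full, or when it
    matches through the first alternative (".*no"), which has no group.
    """
    match = PATTERN.fullmatch(line)
    if match is None or match.group(1) is None:
        return None
    return match.group(1), match.start(1), match.end(1)


line = "No other concurrent anticancer therapy"
print(group1_span(line))
```

The begin and end offsets are the information a Group1 annotation would
need. Note that a line matching only through ".*no" yields no group 1 at
all, so there would be nothing to annotate in that case.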
Is there a way, using the first rule variant
Line{REGEXP(".*no|No (.*)") -> Rule1NoPattern};
to also annotate the text matched by the capturing group?
I have tried all kinds of syntax, but none of it seems to be correct.
Thanks,
Bonnie MacKellar