Hi,
there are several ways to annotate that without changing the seeder.
Your rule won't work for several reasons, e.g., the REGEXP condition
checks only the covered text of the matching rule element (W), which is
only one word.
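To illustrate the point outside of Ruta, here is a small Python sketch. The sample string and the tokenizer are assumptions made for the example: the tokenizer only approximates how a W annotation covers a single word and is not Ruta's actual seeder. The idea is that a pattern spanning several tokens can never match when it is tested against one token at a time.

```python
import re

# Hypothetical sample input, modelled on the <w:t>AnyText</w:t> shape from the question.
document = "<w:t>AnyText</w:t>"
pattern = re.compile(r"<w:t>(.+?)</w:t>")

# Rough stand-in for word-level annotations: word runs and single symbols.
# This is an approximation for illustration, not the Ruta seeder.
tokens = re.findall(r"\w+|[^\w\s]", document)

# Testing the pattern against each token's covered text finds nothing,
# because no single token contains the opening tag, the content, and the closing tag.
token_hits = [t for t in tokens if pattern.search(t)]

# Testing the pattern against the whole document text finds the content.
document_hits = pattern.findall(document)

print(tokens)
print(token_hits)
print(document_hits)
```

The options below avoid the problem either by spanning several annotations in one rule or by matching against the document text directly.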
Here are some ways to annotate it (not tested):
Option 1: a normal rule (I think ":" is included in MARKUP for UIMA Ruta
2.4.0)
RETAINTYPE(MARKUP);
MARKUP{REGEXP("<w:t>")} #{-> Text} MARKUP{REGEXP("</w:t>")};
or
MARKUP.ct=="<w:t>" #{-> Text} MARKUP.ct=="</w:t>";
Option 2: a simple regex rule
"<w:t>(.+?)</w:t>" -> 1 = Text;
http://uima.apache.org/d/ruta-current/tools.ruta.book.html#ugr.tools.ruta.language.regexprule
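As a rough illustration of what capturing group 1 of that pattern covers, here is a Python sketch with a made-up input string. It uses Python's re module rather than Ruta, so it only shows the regex behaviour, in particular why the non-greedy ".+?" matters when several <w:t> elements sit on one line.

```python
import re

# Hypothetical input with two <w:t> elements on one line.
text = "<w:p><w:t>First</w:t><w:t>Second</w:t></w:p>"

# Non-greedy: each match stops at the nearest closing tag.
lazy = re.findall(r"<w:t>(.+?)</w:t>", text)

# Greedy: one match runs through to the last closing tag.
greedy = re.findall(r"<w:t>(.+)</w:t>", text)

# Offsets of group 1 for each non-greedy match. In the regex rule above,
# group 1 is the span that would receive the Text annotation.
spans = [m.span(1) for m in re.finditer(r"<w:t>(.+?)</w:t>", text)]
covered = [text[start:end] for start, end in spans]

print(lazy)
print(greedy)
print(covered)
```

With the non-greedy form, each element's content is captured separately, while the greedy form swallows the tags in between.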
Option 3: use HtmlAnnotator
something like:
ENGINE utils.HtmlAnnotator;
TYPESYSTEM utils.HtmlTypeSystem;
EXEC(HtmlAnnotator, {TAG});
TAG.name=="w:t"{-> Text};
The HtmlAnnotator can be configured to only annotate the content of xml
elements.
http://uima.apache.org/d/ruta-current/tools.ruta.book.html#ugr.tools.ruta.ae.html
Best,
Peter
On 17.02.2016 at 10:33, AmyJacksonKatrina wrote:
Peter Klügl <peter.kluegl@...> writes:
Hi,
did the answer to your last mail help?
What changes did you try which had no effect?
Can you explain your use case in more detail?
There is probably a much easier solution than to modify the seed file.
Best,
Peter
On 10.02.2016 at 06:34, AmyJacksonKatrina wrote:
How can I edit the seed file in UIMA Ruta so that the changes take
effect in the Eclipse output? Whatever changes I make, the Eclipse
output stays the same as usual.
Thanks in advance.
Thank you, Peter. I have been trying to match text of the form
<w:t>AnyText</w:t> in an XML file, but the regex pattern
that I used in a script,
W{REGEXP("(<w:t>(.+?)</w:t>)")->MARK(Text)};
does not match. So I would like to know: will Ruta accept a long regex
pattern, or do I have to put it in the seed.flex file? Please help me
with a solution to match this text.