Hi,

There are several ways to annotate that without changing the seeder.

Your rule won't work for several reasons. For example, the REGEXP condition checks only the covered text of the matching rule element (W), which is a single word, so it never sees the whole <w:t>...</w:t> span.
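
The point about covered text can be illustrated outside Ruta with a small Java sketch. It assumes a crude split into tag and word pieces as a stand-in for token-sized annotations (this is not the actual seeder, and the class and method names are hypothetical). The pattern from the rule is not found in any single piece, but it is found in the whole text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CoveredTextDemo {
    // Same pattern as in the rule from the question.
    static final Pattern RULE = Pattern.compile("(<w:t>(.+?)</w:t>)");

    // Assumption: a rough tokenizer that yields each tag and each word as its own piece.
    static final Pattern PIECE = Pattern.compile("<[^>]+>|[^<\\s]+");

    // Splits the text into tag and word pieces.
    static List<String> pieces(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = PIECE.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    // True if the rule pattern is found inside any single piece on its own.
    static boolean anyPieceMatches(String text) {
        for (String piece : pieces(text)) {
            if (RULE.matcher(piece).find()) {
                return true;
            }
        }
        return false;
    }

    // True if the rule pattern is found somewhere in the whole text.
    static boolean wholeTextMatches(String text) {
        return RULE.matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "<w:t>AnyText</w:t>";
        System.out.println(pieces(text));           // the three pieces: tag, word, tag
        System.out.println(anyPieceMatches(text));  // prints false
        System.out.println(wholeTextMatches(text)); // prints true
    }
}
```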

Here are some ways to annotate it (not tested):

Option 1: a normal rule (I think ":" is included in MARKUP for UIMA Ruta 2.4.0)
RETAINTYPE(MARKUP);
MARKUP{REGEXP("<w:t>")} #{-> Text} MARKUP{REGEXP("</w:t>")};
or
MARKUP.ct=="<w:t>" #{-> Text} MARKUP.ct=="</w:t>";

Option 2: a simple regex rule
"<w:t>(.+?)</w:t>" -> 1 = Text;
http://uima.apache.org/d/ruta-current/tools.ruta.book.html#ugr.tools.ruta.language.regexprule
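
As a plain Java illustration of which text capture group 1 covers in that pattern (a sketch outside Ruta; the class name, method name, and sample string are made up for the example): the reluctant `.+?` stops at the first closing tag, so each <w:t> element yields its own span instead of one span running to the last </w:t>.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupOneDemo {
    // Same pattern as in the regex rule; group 1 is the text between the tags.
    static final Pattern P = Pattern.compile("<w:t>(.+?)</w:t>");

    // Collects the group 1 text of every match in the input.
    static List<String> groupOneTexts(String xml) {
        List<String> texts = new ArrayList<>();
        Matcher m = P.matcher(xml);
        while (m.find()) {
            texts.add(m.group(1));
        }
        return texts;
    }

    public static void main(String[] args) {
        // Hypothetical sample input with two text runs.
        String xml = "<w:r><w:t>First</w:t></w:r><w:r><w:t>Second run</w:t></w:r>";
        System.out.println(groupOneTexts(xml)); // prints [First, Second run]
    }
}
```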

Option 3: use the HtmlAnnotator
Something like:
ENGINE utils.HtmlAnnotator;
TYPESYSTEM utils.HtmlTypeSystem;
EXEC(HtmlAnnotator, {TAG});
TAG.name=="w:t"{-> Text};

The HtmlAnnotator can be configured to annotate only the content of XML elements.
http://uima.apache.org/d/ruta-current/tools.ruta.book.html#ugr.tools.ruta.ae.html

Best,

Peter


On 17.02.2016 at 10:33, AmyJacksonKatrina wrote:
Peter Klügl <peter.kluegl@...> writes:

Hi,

did the answer to your last mail help?
What changes did you try which had no effect?
Can you explain your use case in more detail?

There is probably a much easier solution than to modify the seed file.

Best,

Peter

On 10.02.2016 at 06:34, AmyJacksonKatrina wrote:
How can I edit the seed file in UIMA Ruta so that the changes take
effect in the Eclipse output? Whatever changes I make, the Eclipse
output stays the same as usual.
Thanks in advance.




Thank you, Peter. I have been trying to match text of the form
    <w:t>AnyText</w:t>
in an XML file, but the regex pattern that I used in my script,
  W{REGEXP("(<w:t>(.+?)</w:t>)")->MARK(Text)};
does not match. So I would like to know: does Ruta accept a long regex
pattern in a rule, or do I have to put it in the seed.flex file? Please
help me with a solution to match this text.
