Re: annotator based on regular expressions over (previous) annotations: state-of-work in UIMA?

Igor Sominsky Fri, 16 Jan 2009 17:05:19 -0800

Armando,

Now that I understand your goals better, you are right on all of the points that you have made. Only the feature VALUES can be evaluated/transformed with regular expressions. The overall search criteria must be explicitly specified using FESL tags. I like the idea of using regexps for the search very much; I am just not sure about the complexity of the implementation, although I might be completely wrong and overestimating it.

Please let me know if you need any other information on CFE or would like to discuss it.


Thanks
Igor

----- Original Message -----
From: "Armando Stellato" <[email protected]>
To: <[email protected]>
Sent: Friday, January 16, 2009 7:16 PM
Subject: R: annotator based on regular expressions over (previous) annotations: state-of-work in UIMA?



Hi Igor,

thanks for the pointer. I've had a quick look through your LREC paper:

http://domino.research.ibm.com/comm/research_projects.nsf/pages/medicalinformatics.pubs.html/$FILE/CFE_sominsky-A4.pdf

and a presentation I found on the Web:

http://watchtower.coling.uni-jena.de/~coling/uimaws_lrec2008/slides/sominsky_20080531_talk_CFE.pdf

At first glance, it seemed quite different from what I needed. FESL is a (I hope not to abuse the term :-) ) transformer of UIMA features. The target may be new UIMA features or other kinds of data (as suggested by the title of the paper and the example in figure 3, which points to its use in Machine Learning: extracting useful info from the existing annotations to feed a learner). However, I tried to understand it better, because it could still have the power to do what I was looking for, which is to apply regular expressions over the content of a document, with the elements of the expressions being not only strings, digits, etc., but also Annotation types. For example (with a very simple syntax), stating that:

.* {<PersonTitle> <Name>}

will extract a new Annotation called Person when matching the string "Mr John Doe" (previously annotated with PersonTitle and Name annotations).

Lastly, I think I found the problem: in the paper you mention Reg Exps as one of the 5 filters which can be applied to evaluate values (upper right part of page 3 of the paper), but the overall search mechanism (points a) to f), upper LEFT part of page 3) is not based on regular expressions nor, I think, does it have their power (though I will delve into the details of point f) with further reading).
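To make the `.* {<PersonTitle> <Name>}` idea concrete, here is a minimal sketch in Python of one possible way to run a regular expression over annotation types rather than over characters. It is not UIMA or CFE code: the `Annotation` class, the `match_annotation_pattern` function and the `<Type>` token encoding are all assumptions made for illustration.

```python
import re
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class Annotation:
    type: str   # annotation type name, e.g. "PersonTitle" (hypothetical model)
    begin: int  # start offset in the document text
    end: int    # end offset (exclusive)


def match_annotation_pattern(annotations, pattern, new_type):
    """Match a regex against the sequence of annotation types.

    Each annotation is encoded as the token "<Type>", so an ordinary
    regex such as "<PersonTitle><Name>" matches a run of annotations.
    """
    ordered = sorted(annotations, key=lambda a: a.begin)
    tokens = [f"<{a.type}>" for a in ordered]
    encoded = "".join(tokens)

    # starts[i] is the offset of token i inside the encoded string.
    starts, pos = [], 0
    for token in tokens:
        starts.append(pos)
        pos += len(token)

    results = []
    for m in re.finditer(pattern, encoded):
        if m.start() == m.end():
            continue  # ignore empty matches
        # Map character offsets in the encoded string back to annotations.
        first = bisect_right(starts, m.start()) - 1
        last = bisect_left(starts, m.end()) - 1
        results.append(
            Annotation(new_type, ordered[first].begin, ordered[last].end)
        )
    return results


text = "Mr John Doe"
annotations = [Annotation("PersonTitle", 0, 2), Annotation("Name", 3, 11)]
persons = match_annotation_pattern(annotations, r"<PersonTitle><Name>", "Person")
```

Under these assumptions, `persons` holds a single Person annotation whose span covers the title and the name. Encoding the types as a string keeps the full power of the regex engine (alternation, repetition, grouping) at the annotation level, but it ignores overlapping annotations and any text not covered by an annotation.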

On the basis of what I gathered from the reading, I think it is not what I need, though it could surely be included as part of it. For example (again, simple syntax):


.* {<Person>} "salary" <Currency>:normalizedvalue > 300000€

To extract instances of RichPerson
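The RichPerson rule mixes three kinds of elements: an annotation type, a literal string, and a constraint on a feature value. A rough Python sketch of how such a rule might be evaluated follows. Everything here is an assumption for illustration: the `Annotation` class, the `find_rich_persons` function, and the loosened reading in which "salary" only has to appear somewhere between a Person and the next Currency.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    type: str
    begin: int
    end: int
    features: dict = field(default_factory=dict)  # e.g. {"normalizedvalue": 500000}


def find_rich_persons(text, annotations, threshold=300000):
    """Person, then the word "salary", then a Currency above the threshold."""
    persons = sorted(
        (a for a in annotations if a.type == "Person"), key=lambda a: a.begin
    )
    currencies = sorted(
        (a for a in annotations if a.type == "Currency"), key=lambda a: a.begin
    )
    results = []
    for person in persons:
        # Nearest Currency annotation that starts after this Person ends.
        currency = next((c for c in currencies if c.begin >= person.end), None)
        if currency is None:
            continue
        gap = text[person.end:currency.begin]
        value = currency.features.get("normalizedvalue", 0)
        # Literal-string test on the gap plus numeric test on the feature.
        if re.search(r"\bsalary\b", gap) and value > threshold:
            results.append(Annotation("RichPerson", person.begin, person.end))
    return results


text = "John Doe salary 500000; Ann Lee salary 20000"
annotations = [
    Annotation("Person", 0, 8),
    Annotation("Currency", 16, 22, {"normalizedvalue": 500000}),
    Annotation("Person", 24, 31),
    Annotation("Currency", 39, 44, {"normalizedvalue": 20000}),
]
rich = find_rich_persons(text, annotations)
```

With this input only the first Person qualifies, since the second salary is below the threshold. A real implementation would compile such rules from a pattern language instead of hard-coding one rule per function.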

If I missed some crucial aspect, please let me know,

Thanks in advance,

Armando Stellato

-----Original Message-----
From: Igor Sominsky [mailto:[email protected]]
Sent: Friday, January 16, 2009 22:59
To: [email protected]
Subject: Re: annotator based on regular expressions over (previous) annotations: state-of-work in UIMA?

Armando,

In the posted version of CFE you can alter the value of an extracted feature by
applying a Java regular expression. The code that is currently under
development would allow combining several values by using Java regular
expressions or math expressions. The grammar of math expressions includes
the capability of using Java functions and constants (through reflection).
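
As a rough illustration of the two operations described above (rewriting one feature value with a regular expression, and combining several values with a math expression), here is a small Python sketch. It does not reproduce CFE or FESL syntax; the function names `transform_feature_value` and `combine_values` are hypothetical. Note also that Python's `re` differs from `java.util.regex` in some details, for example group references in replacements are written `\1` rather than `$1`.

```python
import re


def transform_feature_value(value, pattern, replacement):
    """Rewrite a feature value by regex substitution."""
    return re.sub(pattern, replacement, value)


def combine_values(values, expression):
    """Combine several feature values with a numeric expression.

    `expression` is any callable taking one float per value.
    """
    return expression(*(float(v) for v in values))


# Strip everything that is not a digit from a raw currency string.
normalized = transform_feature_value("EUR 300.000", r"\D", "")

# Combine two extracted values: yearly salary divided by number of months.
monthly = combine_values([normalized, "12"], lambda salary, months: salary / months)
```

Here `normalized` is the string "300000" and `monthly` is 25000.0.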

I hope that answers your question. Please let me know if you need more
information.

Thanks
Igor


----- Original Message -----
From: "Armando Stellato" <[email protected]>
To: "UIMA" <[email protected]>
Sent: Friday, January 16, 2009 1:19 PM

Subject: annotator based on regular expressions over (previous) annotations: state-of-work in UIMA?


> Hi all,
>
>
>
> From a few posts, like the one at the following link:
>
>
>
> http://osdir.com/ml/apache.uima.general/2008-05/msg00070.html
>
>
>
> it seems that there is some interest in seeing this kind of processor in
> the UIMA array of available components.
>
> Since we're considering developing a new one, but would prefer not to
> reinvent the wheel :-), I'm asking whether someone is already doing the
> same and, if so, to get pointers to their work and to know whether it is
> available, whether it is still a work in progress, etc.
>
>
>
> Best regards,
>
>
>
> Armando Stellato
>
>
>
> --------------------------------------------------
>
>
>
> Ing. Armando Stellato, PhD
>
> AI Research Group,
>
> Dept. of Computer Science, Systems and Production
>
> University of Roma, Tor Vergata
>
> Via del Politecnico 1 00133 ROMA (ITALY)
>
> tel: +39 06 7259 7330 (office, room A1-14);
>
>     +39 06 7259 7332 (lab)
>
> fax: +39 06 7259 7460
>
> e_mail: [email protected]
>
> yahoo: stellato75
>
> jabber(gtalk): [email protected] <mailto:[email protected]>
>
> skype: odnamar
>
>
>
> --------------------------------------------------
>
>
>
>
