Armando,
Now that I understand your goals better, I see that you are right on all of
the points you made. Only the feature VALUES can be evaluated/transformed
with regular expressions. The overall search criteria must be explicitly
specified using FESL tags. I like the idea of using regexps for the search
very much; I am just not sure about the complexity of the implementation,
although I may well be overestimating it.
Please let me know if you need any other information on CFE or would like to
discuss it.
Thanks
Igor
----- Original Message -----
From: "Armando Stellato" <[email protected]>
To: <[email protected]>
Sent: Friday, January 16, 2009 7:16 PM
Subject: Re: annotator based on regular expressions over (previous)
annotations: state-of-work in UIMA?
Hi Igor,
thanks for the pointer. I've had a quick look through your LREC paper:
http://domino.research.ibm.com/comm/research_projects.nsf/pages/medicalinformatics.pubs.html/$FILE/CFE_sominsky-A4.pdf
and a presentation I found on the Web:
http://watchtower.coling.uni-jena.de/~coling/uimaws_lrec2008/slides/sominsky_20080531_talk_CFE.pdf
At first glance, it seemed quite different from what I needed.
FESL is a (I hope not to abuse the term :-) ) transformer of UIMA
features. The target may be new UIMA features or other kinds of data (as
suggested by the title of the paper and the example in figure 3, which
points to its use in Machine Learning: extracting useful info from the
existing annotations to feed a learner). However, I tried to understand it
better, because it might still have the power to do what I was looking for,
which is to apply regular expressions over the content of a document, where
the elements of the expressions are not only strings, digits, etc., but also
Annotation types. For example (with a very simple syntax), stating that:
.* {<PersonTitle> <Name>}
will extract a new Annotation called Person when matching the string
"Mr John Doe" (previously annotated with PersonTitle and Name annotations).
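A minimal sketch of the idea in the message above, written in Python for brevity. It is not UIMA or CFE code: the `Annotation` class and the `find_sequences` helper are hypothetical names introduced for illustration. The sketch encodes the ordered annotations as a string of type tokens (so the pattern is written `<PersonTitle><Name>`, without the braces and spaces of the informal syntax), runs an ordinary regular expression over that string, and maps each match back to character offsets.

```python
import re
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical stand-in for a typed span over the document text."""
    type_name: str
    begin: int
    end: int


def find_sequences(annotations, type_pattern, new_type):
    """Match a regex over annotation type names; emit one covering annotation per match."""
    anns = sorted(annotations, key=lambda a: a.begin)
    encoded = ""
    starts = {}  # offset in `encoded` where annotation i begins
    ends = {}    # offset in `encoded` where annotation i ends
    for i, ann in enumerate(anns):
        starts[len(encoded)] = i
        encoded += f"<{ann.type_name}>"
        ends[len(encoded)] = i
    results = []
    for m in re.finditer(type_pattern, encoded):
        # Keep only non-empty matches aligned on whole type tokens.
        if m.end() > m.start() and m.start() in starts and m.end() in ends:
            first = anns[starts[m.start()]]
            last = anns[ends[m.end()]]
            results.append(Annotation(new_type, first.begin, last.end))
    return results


text = "Yesterday Mr John Doe arrived"
annotations = [
    Annotation("PersonTitle", 10, 12),  # "Mr"
    Annotation("Name", 13, 21),         # "John Doe"
]
persons = find_sequences(annotations, r"<PersonTitle><Name>", "Person")
print(text[persons[0].begin:persons[0].end])
```

Encoding types as a string keeps the full power of the regex engine (alternation, repetition, optional elements) over annotation sequences, at the cost of ignoring the text between annotations.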
In the end, I think I found the issue: in the paper you mention regular
expressions as one of the 5 filters that can be applied to evaluate values
(upper right part of page 3 of the paper), but the overall search mechanism
(points a) to f), upper LEFT part of page 3) is not based on regular
expressions nor, I think, does it have their power (though I will delve into
the details of point f) with further reading).
Based on what I got from the reading, I think it is not what I need,
though it could surely be included as part of it. For example (again with a
simple syntax):
.* {<Person>} "salary" <Currency>:normalizedvalue > 300000€
would extract instances of RichPerson.
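The RichPerson pattern mixes three kinds of elements: an annotation type, a literal word, and a constraint on a feature value. One way to picture this, as a rough Python sketch and not as anything UIMA or CFE provides, is a sequence of predicates checked against consecutive annotations. All names here (`Ann`, `is_type`, `is_literal`, `feature_gt`, `match_sequence`) and the sample data are assumptions made for illustration; the leading `.*` of the informal syntax is implied by scanning every start position.

```python
from dataclasses import dataclass, field


@dataclass
class Ann:
    """Hypothetical annotation: a type, its covered text, and a feature map."""
    type_name: str
    text: str
    features: dict = field(default_factory=dict)


def is_type(name):
    return lambda a: a.type_name == name


def is_literal(word):
    # Plain words are modelled as annotations of an assumed type "Token".
    return lambda a: a.type_name == "Token" and a.text.lower() == word


def feature_gt(name, feat, threshold):
    return lambda a: a.type_name == name and a.features.get(feat, 0) > threshold


def match_sequence(anns, predicates):
    """Return start indices where consecutive annotations satisfy all predicates in order."""
    n = len(predicates)
    return [i for i in range(len(anns) - n + 1)
            if all(p(a) for p, a in zip(predicates, anns[i:i + n]))]


rich_person = [
    is_type("Person"),
    is_literal("salary"),
    feature_gt("Currency", "normalizedvalue", 300000),
]

doc = [
    Ann("Person", "Mr John Doe"),
    Ann("Token", "salary"),
    Ann("Currency", "€450,000", {"normalizedvalue": 450000}),
    Ann("Person", "Ms Jane Roe"),
    Ann("Token", "salary"),
    Ann("Currency", "€40,000", {"normalizedvalue": 40000}),
]
hits = [doc[i].text for i in match_sequence(doc, rich_person)]
print(hits)
```

A fixed-length predicate sequence cannot express repetition or optional elements; a real implementation would need a proper automaton over annotations for that.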
If I missed some crucial aspect, please let me know,
Thanks in advance,
Armando Stellato
-----Original Message-----
From: Igor Sominsky [mailto:[email protected]]
Sent: Friday, January 16, 2009 22:59
To: [email protected]
Subject: Re: annotator based on regular expressions over (previous)
annotations: state-of-work in UIMA?
Armando,
In the posted version of CFE you can alter the value of an extracted feature
by applying a Java regular expression. The code that is currently under
development will allow combining several values by using Java regular
expressions or math expressions. The grammar of math expressions includes
the capability to use Java functions and constants (through reflection).
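To illustrate what altering a feature value with a regular expression and then using it in a math expression can look like, here is a small Python sketch. It does not use CFE or FESL, whose configuration syntax is not shown in this thread, and Python's `re` differs in some details from Java regular expressions. The `transform_value` helper and the sample value are hypothetical.

```python
import re


def transform_value(value, pattern, replacement):
    """Rewrite an extracted feature value with a regular expression."""
    return re.sub(pattern, replacement, value)


# Hypothetical feature value as it might be extracted from an annotation.
raw_amount = "EUR 450,000.00"

# Keep only digits and the decimal point.
digits = transform_value(raw_amount, r"[^0-9.]", "")

# Use the cleaned value in a simple math expression.
monthly = float(digits) / 12
print(digits, monthly)
```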
I hope that answers your question. Please let me know if you need more
information.
Thanks
Igor
----- Original Message -----
From: "Armando Stellato" <[email protected]>
To: "UIMA" <[email protected]>
Sent: Friday, January 16, 2009 1:19 PM
Subject: annotator based on regular expressions over (previous)
annotations: state-of-work in UIMA?
> Hi all,
>
> From a few posts, like the one at the following link:
>
> http://osdir.com/ml/apache.uima.general/2008-05/msg00070.html
>
> it seems that there is some interest in seeing such kind of processor in
> the UIMA array of available components.
>
> Since we're considering working on developing a new one, but would prefer
> not to reinvent the wheel :-), I'm asking if there is already someone
> doing the same and, in case, get pointers to their work, know if it is
> available, if it's still in work-in-progress etc.
>
> Best regards,
>
> Armando Stellato
>
> --------------------------------------------------
> Ing. Armando Stellato, PhD
> AI Research Group,
> Dept. of Computer Science, Systems and Production
> University of Roma, Tor Vergata
> Via del Politecnico 1 00133 ROMA (ITALY)
> tel: +39 06 7259 7330 (office, room A1-14);
>      +39 06 7259 7332 (lab)
> fax: +39 06 7259 7460
> e_mail: [email protected]
> yahoo: stellato75
> jabber(gtalk): [email protected] <mailto:[email protected]>
> skype: odnamar
> --------------------------------------------------