Many thanks DJ, I'll certainly look it over.

One question: I saw in the FAQ that it can be downloaded for a 90-day evaluation period. Is it going to be freely usable and/or open source or not? If so, is there any way to go beyond the evaluation period if the user is satisfied?

Cheers,
Armando

> -----Original Message-----
> From: D.J. McCloskey [mailto:[email protected]]
> Sent: Saturday, 17 January 2009 12:41
> To: [email protected]
> Subject: Re: R: annotator based on regular expressions over (previous) annotations: state-of-work in UIMA?
>
> Hi Armando,
>
> Have you had a look at the LanguageWare technology on alphaWorks? I think it might be what you are looking for. Take a look at the technology posted here (http://www.alphaworks.ibm.com/tech/lrw) on IBM's alphaworks site - it seems really close to what you are looking for.
>
> What is there is an Eclipse-based workbench for configuring an aggregate analyzer, i.e. rules and dictionaries which then drive a UIMA pipeline consisting of language identification, lexical analysis with linguistic normalization, POS tagging, and a finite-state-transducer-based rule annotator which operates over annotations and features in the CAS.
>
> The UI doesn't expose all the capabilities in the underlying annotators, but I'd be really interested to have your opinions about it.
> Feel free to contact us through the mail address in the FAQ for specifics.
>
> Regards,
> -DJ
> -------------------
> D.J McCloskey
> IBM LanguageWare Architect
>
> ... our external website: http://www-306.ibm.com/software/globalization/topics/languageware/index.jsp
> ... our Alphaworks: http://www.alphaworks.ibm.com/tech/lrw
> ... our Wikipedia: http://en.wikipedia.org/wiki/Languageware
>
> IBM Ireland Product Distribution Limited registered in Ireland with number 92815. Registered office: Oldbrook House, 24-32 Pembroke Road, Ballsbridge, Dublin 4
>
> > From: "Armando Stellato" <[email protected]>
> > To: <[email protected]>
> > Date: 17/01/2009 00:18
> > Subject: R: annotator based on regular expressions over (previous) annotations: state-of-work in UIMA?
> >
> > Hi Igor,
> >
> > thanks for the pointer.
> > I've had a quick read through your LREC paper:
> > http://domino.research.ibm.com/comm/research_projects.nsf/pages/medicalinformatics.pubs.html/$FILE/CFE_sominsky-A4.pdf
> >
> > and a presentation I found on the Web:
> > http://watchtower.coling.uni-jena.de/~coling/uimaws_lrec2008/slides/sominsky_20080531_talk_CFE.pdf
> >
> > At first glance, it seemed quite different from what I needed. FESL is a (I hope not to abuse the term :-) ) transformer from UIMA features. The target may be new UIMA features or other kinds of data (as suggested by the title of the paper and the example in figure 3, which point to its use in Machine Learning: extracting useful info from the existing annotations to feed a learner). However, I tried to understand it better, because it could still have the power to do what I was looking for, which is to apply regular expressions over the content of a document, with the elements of the expressions being not only strings, digits etc. but also Annotation types. For example (with a very simple syntax), the rule:
> > .* {<PersonTitle> <Name>}
> > will extract a new Annotation called Person when matching the string "Mr John Doe" (previously annotated with PersonTitle and Name annotations).
> >
> > Lastly, I think I found the problem: in the paper you mention regular expressions as one of the 5 filters which can be applied to evaluate values (upper right part of page 3 of the paper), but the overall search mechanism (points a) to f), upper LEFT part of page 3) is not based on regular expressions nor, I think, has their power (though I will delve into the details of point f) with further reading).
> >
> > On the basis of what I got from the reading, I think it is not what I need, though it could surely be included as part of it.
> > For example (again, simple syntax):
> >
> > .* {<Person>} "salary" <Currency>:normalizedvalue > 300000€
> >
> > to extract instances of RichPerson.
> >
> > If I missed some crucial aspect, please let me know.
> >
> > Thanks in advance,
> >
> > Armando Stellato
> >
> > -----Original Message-----
> > From: Igor Sominsky [mailto:[email protected]]
> > Sent: Friday, 16 January 2009 22:59
> > To: [email protected]
> > Subject: Re: annotator based on regular expressions over (previous) annotations: state-of-work in UIMA?
> >
> > Armando,
> >
> > In the posted version of CFE you can alter the value of an extracted feature by applying a Java regular expression. The code that is currently under development would allow combining several values by using Java regular expressions or math expressions. The grammar of math expressions includes the capability to use Java functions and constants (through reflection).
> >
> > I hope that answers your question. Please let me know if you need more information.
> >
> > Thanks,
> > Igor
> >
> > ----- Original Message -----
> > From: "Armando Stellato" <[email protected]>
> > To: "UIMA" <[email protected]>
> > Sent: Friday, January 16, 2009 1:19 PM
> > Subject: annotator based on regular expressions over (previous) annotations: state-of-work in UIMA?
> >
> > > Hi all,
> > >
> > > From a few posts, like the one at the following link:
> > >
> > > http://osdir.com/ml/apache.uima.general/2008-05/msg00070.html
> > >
> > > it seems that there is some interest in seeing this kind of processor in the UIMA array of available components.
> > >
> > > Since we're considering developing a new one, but would prefer not to reinvent the wheel :-), I'm asking whether someone is already doing the same and, if so, for pointers to their work and whether it is available, still a work in progress, etc.
> > >
> > > Best regards,
> > >
> > > Armando Stellato
> > >
> > > --------------------------------------------------
> > > Ing. Armando Stellato, PhD
> > > AI Research Group,
> > > Dept. of Computer Science, Systems and Production
> > > University of Roma, Tor Vergata
> > > Via del Politecnico 1 00133 ROMA (ITALY)
> > > tel: +39 06 7259 7330 (office, room A1-14);
> > >      +39 06 7259 7332 (lab)
> > > fax: +39 06 7259 7460
> > > e_mail: [email protected]
> > > yahoo: stellato75
> > > jabber(gtalk): [email protected] <mailto:[email protected]>
> > > skype: odnamar
> > > --------------------------------------------------
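
The kind of rule asked about in this thread (a regular expression whose elements are annotation types, such as <PersonTitle> <Name> producing a Person annotation) can be illustrated with a small Python sketch. It is not CFE, LanguageWare, or any UIMA API: the Annotation class and the match_annotations function are hypothetical names introduced only for this example. It assumes annotations do not overlap, ignores the text between consecutive annotations, and supports only <Type> tokens plus ordinary regex operators (not the quoted literals or feature constraints of the RichPerson example).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for a typed span over the document text."""
    type: str
    begin: int
    end: int


def match_annotations(annotations, pattern, new_type):
    """Apply a regex written over <Type> tokens to a list of annotations.

    Each annotation type is mapped to one private-use character, so the
    ordered annotation sequence becomes a plain string that `re` can scan.
    Every non-empty match yields a new annotation covering the matched span.
    """
    ordered = sorted(annotations, key=lambda a: (a.begin, a.end))
    symbols = {}

    def symbol(type_name):
        # Assign the next unused private-use character to an unseen type.
        return symbols.setdefault(type_name, chr(0xE000 + len(symbols)))

    encoded = "".join(symbol(a.type) for a in ordered)
    # Replace each <Type> token with its character; operators such as
    # ?, +, * and | are left untouched and keep their regex meaning.
    regex = re.compile(
        re.sub(r"<(\w+)>", lambda m: symbol(m.group(1)), pattern.replace(" ", ""))
    )
    return [
        Annotation(new_type, ordered[m.start()].begin, ordered[m.end() - 1].end)
        for m in regex.finditer(encoded)
        if m.end() > m.start()
    ]
```

For the text "Mr John Doe" with a PersonTitle annotation over characters 0-2 and a Name annotation over characters 3-11, calling match_annotations(annotations, "<PersonTitle> <Name>", "Person") returns one Person annotation spanning 0-11. Encoding types as characters keeps the sketch short; a real annotator would more likely run a finite-state matcher directly over the annotation index.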
