On 2/28/07, LASRI YASSINE <[EMAIL PROTECTED]> wrote:
Thank you for your response. My problem is this:
I have an external file that contains a list of person names, for example:
adam
smith
lary
page
... etc
and I need to extract all person names from other sources (text documents),
for example:
"Lary Page is the creator of google and Adam Smith is an economist"
The annotator should extract <Adam Smith> and <Lary Page> as person names. So
what can I do?
I'm not sure I completely understand your scenario, but is it the case
that you've already written an Annotator that creates annotations over
the individual words in the list? So for example it would annotate
<Adam> and <Smith> as separate PersonName annotations?
If so, then I think the approach from my last mail would work. In a
second annotator, iterate over all the PersonName annotations. For
each two consecutive annotations a1 and a2, check if
documentText.substring(a1.getEnd(), a2.getBegin()) is all whitespace.
If so, create a new annotation (e.g. FullPersonName) spanning from
a1.getBegin() to a2.getEnd().
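
To make the pairing step concrete, here is a rough sketch of the idea in
plain Java. It deliberately avoids the real annotator API: Span is a
hypothetical stand-in for a PersonName annotation (just begin/end offsets),
and FullNameMerger and mergeAdjacent are names made up for this example. It
assumes the spans are already sorted by begin offset. It also skips pairs
that touch or overlap, which is my own assumption and not something the
steps above require.

```java
import java.util.ArrayList;
import java.util.List;

class FullNameMerger {

    /** Hypothetical stand-in for a PersonName annotation: begin/end offsets only. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        int getBegin() { return begin; }
        int getEnd() { return end; }
    }

    /** True if s is non-empty and every character in it is whitespace. */
    static boolean isAllWhitespace(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /**
     * For each two consecutive spans a1 and a2 (assumed sorted by begin),
     * emit a new span from a1.getBegin() to a2.getEnd() when only
     * whitespace separates them in documentText.
     */
    static List<Span> mergeAdjacent(String documentText, List<Span> personNames) {
        List<Span> fullNames = new ArrayList<>();
        for (int i = 0; i + 1 < personNames.size(); i++) {
            Span a1 = personNames.get(i);
            Span a2 = personNames.get(i + 1);
            // Assumption: touching or overlapping spans are not merged.
            // This also keeps substring() from seeing begin > end.
            if (a2.getBegin() <= a1.getEnd()) {
                continue;
            }
            String gap = documentText.substring(a1.getEnd(), a2.getBegin());
            if (isAllWhitespace(gap)) {
                fullNames.add(new Span(a1.getBegin(), a2.getEnd()));
            }
        }
        return fullNames;
    }
}
```

Note that three names in a row (say "John Adam Smith") would produce two
overlapping results with this sketch, one per consecutive pair, so you
may want to extend it to merge longer runs if that matters for your data.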
-Adam