Thank you for your response. My problem is this: I have an external file that contains a list of person names, for example:

adam smith
lary page
... etc.

I need to extract all of these person names from other sources (text documents). For example, given the text:

"Lary Page is the creator of google and Adam Smith is an economist"

the annotator should extract <Adam Smith> and <Lary Page> as person names. What can I do?

Best,
Yassine

2007/2/28, Adam Lally <[EMAIL PROTECTED]>:
On 2/28/07, LASRI YASSINE <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I have created an annotator that extracts all strings beginning with a
> capital letter (Accccc), and I want to use this annotator (in an
> aggregate) to extract all sentences containing two strings that both
> begin with a capital letter (Xaaaaa Ybbbbb).
>

Hi,

You will need to create a second annotator, which will take the results of your first annotator and do further processing on them. This approach is shown in the MeetingAnnotator example, which is exercise 4 of the tutorial (see the Annotator & Analysis Engine Developer's Guide chapter in the documentation).

Say your first annotator outputs FeatureStructures of the type CapitalizedWord. Your second annotator would get an iterator over CapitalizedWords, for example:

    jcas.getJFSIndexRepository().getAnnotationIndex(CapitalizedWord.type).iterator()

Then you iterate over the CapitalizedWord annotations, and for each pair of consecutive annotations you can check whether they are adjacent in the document by seeing if the document text between them is all whitespace. If you find an adjacent pair of CapitalizedWords, you can then create a new annotation of some other type that spans both CapitalizedWords.

You then create an Aggregate Analysis Engine that contains both of your annotators. The way to do this is shown in the tutorial as well.

It wasn't clear to me from your question whether you also need to detect sentence boundaries in your document. If so, you can use the example SimpleTokenAndSentenceAnnotator that comes with the SDK.

Hope that helps,
-Adam
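
The pairing step described above can be sketched in plain Java. This is only an illustration of the logic, not UIMA code: the Span class stands in for an annotation's begin/end offsets, capitalizedWords() stands in for the first annotator, and the class name, method names, and the lowercase name set (standing in for the external file of names) are all assumptions made for this example. In a real annotator the spans would come from the CapitalizedWord annotation index, and each match would become a new annotation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AdjacentCapitalizedPairs {

    /** Hypothetical stand-in for an annotation: character offsets into the document. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        String coveredText(String doc) {
            return doc.substring(begin, end);
        }
    }

    /** Stand-in for the first annotator: one span per word that starts with a capital letter. */
    static List<Span> capitalizedWords(String doc) {
        List<Span> spans = new ArrayList<>();
        Matcher m = Pattern.compile("\\b\\p{Lu}\\p{L}*").matcher(doc);
        while (m.find()) {
            spans.add(new Span(m.start(), m.end()));
        }
        return spans;
    }

    /**
     * Second step: join two consecutive spans when the text between them is
     * non-empty and all whitespace. A run of three capitalized words yields
     * two overlapping pairs.
     */
    static List<Span> adjacentPairs(String doc, List<Span> words) {
        List<Span> pairs = new ArrayList<>();
        for (int i = 0; i + 1 < words.size(); i++) {
            Span a = words.get(i);
            Span b = words.get(i + 1);
            String gap = doc.substring(a.end, b.begin);
            if (!gap.isEmpty() && gap.trim().isEmpty()) {
                pairs.add(new Span(a.begin, b.end));
            }
        }
        return pairs;
    }

    /**
     * Assumed extra step for the name-list case: keep only pairs whose text,
     * lowercased and with whitespace collapsed, appears in the set of known names.
     */
    static List<String> matchKnownNames(String doc, List<Span> pairs, Set<String> knownNames) {
        List<String> found = new ArrayList<>();
        for (Span p : pairs) {
            String text = p.coveredText(doc);
            String key = text.toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
            if (knownNames.contains(key)) {
                found.add(text);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String doc = "Lary Page is the creator of google and Adam Smith is an economist";
        // Assumed contents of the external name file, one lowercase name per entry.
        Set<String> names = new HashSet<>(Arrays.asList("adam smith", "lary page"));
        List<Span> pairs = adjacentPairs(doc, capitalizedWords(doc));
        for (String name : matchKnownNames(doc, pairs, names)) {
            System.out.println(name);
        }
    }
}
```

Run on the sentence from this thread, the sketch prints "Lary Page" and then "Adam Smith". The whitespace test on the gap is the same adjacency check described in the reply; the lookup against the name set is just one possible way to restrict the pairs to the names in the external file.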
