This is all from memory:

In the default pipeline, noun phrases are turned into LookupWindow annotations.

The LookupDesc*.xml file in use specifies which annotation type serves as the 
lookup window. 
Sentence is (or at least was at one time) too large a window to be reasonable, 
both for performance and because it would cause words from unrelated parts of 
a sentence to be considered together too often.

LookupWindow annotations were designed to be the input to the dictionary lookup 
component.
The default pipeline creates LookupWindow annotations from noun phrases 
(including NP PP NP patterns), so that verb phrases, etc., are not searched.

I believe the overlap annotator is used to keep things modular: the lookup 
does not depend on what previous components did, and it does not have to 
assume that LookupWindows are non-overlapping. Overlapping ones get merged 
into a single LookupWindow so that, when iterating through all LookupWindow 
annotations, cTAKES won't create multiple annotations for the same mention.
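
To make the merging idea concrete, here is a rough sketch in Python. It is 
only an illustration of merging overlapping spans, not the actual 
OverlapAnnotator code (cTAKES itself is Java). The function name 
merge_overlapping and the (begin, end) tuple representation are made up for 
the example, and it assumes that windows which merely touch are left separate.

```python
from typing import List, Tuple

# Hypothetical stand-in for a LookupWindow: (begin, end) character offsets,
# with end exclusive. Not a cTAKES type.
Span = Tuple[int, int]


def merge_overlapping(windows: List[Span]) -> List[Span]:
    """Merge overlapping spans so each stretch of text is covered once."""
    merged: List[Span] = []
    # Sorting by begin offset means only the most recently kept span can
    # overlap the next one.
    for begin, end in sorted(windows):
        if merged and begin < merged[-1][1]:
            # Overlap: extend the previous span instead of adding a new one.
            prev_begin, prev_end = merged[-1]
            merged[-1] = (prev_begin, max(prev_end, end))
        else:
            merged.append((begin, end))
    return merged


if __name__ == "__main__":
    # Two overlapping windows and one separate window.
    print(merge_overlapping([(0, 10), (5, 15), (20, 30)]))
```

With one merged window per stretch of text, a mention that falls in the 
overlap of two original windows is only looked up once.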

-- James


> -----Original Message-----
> From: [email protected]
> [mailto:ctakes-dev-return-847-
> [email protected]] On Behalf Of Coarr, Matt
> Sent: Wednesday, November 14, 2012 4:20 PM
> To: [email protected]
> Subject: what do they do? lookup window & overlap annotators
> 
> Does anyone have a quick description of what the lookup window annotator
> and the overlap annotator do?
> 
> I've been looking at the descriptor for LookupWindowAnnotator.xml and
> its two subcomponents (which use classes OverlapAnnotator and
> CopyAnnotator).
> 
> I'm trying to get a better grasp for what they do.
> 
> Thanks!
> Matt
