I'm not sure where I got the OpenNLP annotator from, but a web search
should turn it up. The tagger in the sandbox that Thilo pointed out
is probably a good alternative. As for LanguageWare, that is a product,
not an open source project.
On Jul 16, 2008, at 2:20 PM, Ahmed Abdeen Hamed wrote:
Can you point me to the source for those UIMA annotators? I would like
to use one of them for a really simple task. Thanks again!
Ahmed
On Wed, Jul 16, 2008 at 1:56 PM, Michael Tanenblatt <[EMAIL PROTECTED]> wrote:
I have used the OpenNLP tagger as well as the IBM LanguageWare product,
both of which are available as UIMA annotators.
On Jul 16, 2008, at 1:49 PM, Ahmed Abdeen Hamed wrote:
Thanks Michael. I like the idea of attaching the POS to dictionary
terms. What POS tagger are you using then? Is it the Stanford tagger or
LingPipe? I doubt that UIMA has a native POS tagger.
Ahmed
On Wed, Jul 16, 2008 at 1:24 PM, Michael Tanenblatt <[EMAIL PROTECTED]> wrote:
ConceptMapper maps entries in the dictionary to new annotations using
the AE descriptor parameters "AttributeList" and "FeatureList". From
the comments in the descriptor:
AttributeList: List of attribute names for XML dictionary entry
    record - must correspond to FeatureList
FeatureList: List of feature names for CAS annotation - must
    correspond to AttributeList
In other words, these are two parallel arrays mapping from the
attributes in the dictionary entries to the new annotation features.
So, if your dictionary entries had an attribute named "POS_Tag", e.g.:
<token canonical="abdomen, nos" POS_Tag="NN">
    <variant base="abdomen, nos" />
    <variant base="abdomen" />
</token>
and the resultant annotations had the feature "PartOfSpeechTag", the
parameter "AttributeList" (an array) would have "POS_Tag" at the same
position (array index) as the parameter "FeatureList" would have
"PartOfSpeechTag".
One key piece of information: ConceptMapper does not do any POS
tagging; it only maps from the dictionary. In some cases, I have run a
tokenizer/POS tagger and then used this technique to unconditionally
override the computed POS tag in the token, using the
TokenClassWriteBackFeatureNames parameter. This allows any attributes
from the dictionary to be stuffed back into all of the matching tokens,
which can sometimes be useful...
TokenClassWriteBackFeatureNames: names of features that should be
    written back to a token, such as a POS tag
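A sketch of how that parameter might look in the descriptor. It assumes
the parameter is a multi-valued string array (the plural "names"
suggests so, but check the descriptor shipped with ConceptMapper) and
that the token type has a feature called "PartOfSpeechTag"; that
feature name is carried over from the example above and is not
necessarily what your tokenizer's type system uses.

```xml
<nameValuePair>
  <name>TokenClassWriteBackFeatureNames</name>
  <value>
    <array>
      <!-- assumed: the feature to overwrite on every matching token -->
      <string>PartOfSpeechTag</string>
    </array>
  </value>
</nameValuePair>
```

Because the override is unconditional, any tag the POS tagger computed
for a matching token would be replaced by the dictionary value.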
On Jul 16, 2008, at 1:07 PM, Ahmed Abdeen Hamed wrote:
Hello,
TokenAnnotation objects don't get fully populated with data after
annotation. For instance, the POS feature returns null when printing
out an annotation object. Apparently, this feature needs to be set
while doing the annotation. How does ConceptMapper do the POS tagging?
I appreciate any insights!
Best wishes,
Ahmed