ConceptMapper maps entries in the dictionary to new annotations using
the AE descriptor parameters "AttributeList" and "FeatureList". From
the comments in the descriptor:
    AttributeList: List of attribute names for XML dictionary entry
    record - must correspond to FeatureList

    FeatureList: List of feature names for CAS annotation - must
    correspond to AttributeList
In other words, these are two parallel arrays mapping from the
attributes in the dictionary entries to the new annotation features.
So, if your dictionary entries had an attribute named "POS_Tag", e.g.:

    <token canonical="abdomen, nos" POS_Tag="NN">
        <variant base="abdomen, nos" />
        <variant base="abdomen" />
    </token>

and the resultant annotations had the feature "PartOfSpeechTag", the
parameter "AttributeList" (an array) would have "POS_Tag" at the same
position (array index) as the parameter "FeatureList" would have
"PartOfSpeechTag".
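
As a rough sketch of how that might look in the AE descriptor (assuming both parameters are declared as multi-valued String parameters, and using the standard nameValuePair layout inside configurationParameterSettings), the two arrays line up by position. The second pair, mapping "canonical" to a feature called "CanonicalForm", is a made-up example just to show the index alignment; that feature name is hypothetical.

```xml
<!-- Sketch only: a fragment of <configurationParameterSettings>. -->
<nameValuePair>
  <name>AttributeList</name>
  <value>
    <array>
      <string>POS_Tag</string>    <!-- index 0: dictionary attribute -->
      <string>canonical</string>  <!-- index 1: dictionary attribute -->
    </array>
  </value>
</nameValuePair>
<nameValuePair>
  <name>FeatureList</name>
  <value>
    <array>
      <string>PartOfSpeechTag</string>  <!-- index 0: annotation feature -->
      <string>CanonicalForm</string>    <!-- index 1: hypothetical feature name -->
    </array>
  </value>
</nameValuePair>
```

Each feature named in FeatureList would also need to exist on the result annotation type in your type system.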
One key piece of information: ConceptMapper does not do any POS
tagging; it only maps from the dictionary. In some cases, I have run a
tokenizer/POS-tagger first and then used this technique to
unconditionally override the computed POS tag in the token, using the
TokenClassWriteBackFeatureNames parameter. This allows any attributes
from the dictionary to be written back into all of the matching
tokens, which can sometimes be useful. From the descriptor comments:
TokenClassWriteBackFeatureNames: names of features that should be
written back to a token, such as a POS tag
On Jul 16, 2008, at 1:07 PM, Ahmed Abdeen Hamed wrote:
Hello,

TokenAnnotation objects don't get fully populated with data after
annotation. For instance, POS feature returns null when printing out
an annotation object. Apparently, this feature needs to be set while
doing the annotation. How does ConceptMapper do the POS tagging? I
appreciate any insights!

Best wishes,
Ahmed