On 11/20/2012 01:33 PM, Jim foo.bar wrote:
> also, the only information that I could find about the
> *TokenClassFeatureGenerator* is this oddly phrased sentence:
> _"Generates features for different for the class of the token."_
> How does this generator work?
> What 'class' does this refer to in a name-finding context? semantic
> class? If we're looking for genes and drugs, would the classes be
> "gene", "drug" & presumably "none"?
No, it is not a semantic class. The generator assigns a category to each
token based on the characters it contains, for example:
- token starts with a capital letter
- token is all upper case
- token is numeric
- token is alphanumeric
...
Have a look at the code to see all the classes and the conditions under
which they are assigned.
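
To give a rough idea of what such a character-based classification looks like, here is a minimal sketch. It is not the OpenNLP implementation: the class name `TokenClassSketch`, the method `tokenClass` and the label strings are made up for illustration, and the real generator distinguishes more classes than these.

```java
// Hypothetical illustration only; not the OpenNLP source.
final class TokenClassSketch {

    // Maps a token to a coarse class based purely on its characters.
    // The label strings are invented for this sketch.
    static String tokenClass(String token) {
        boolean hasDigit = false, hasUpper = false, hasLower = false, hasOther = false;
        for (char c : token.toCharArray()) {
            if (Character.isDigit(c)) hasDigit = true;
            else if (Character.isUpperCase(c)) hasUpper = true;
            else if (Character.isLowerCase(c)) hasLower = true;
            else hasOther = true; // punctuation, symbols, caseless letters
        }
        if (token.isEmpty() || hasOther) return "other";
        if (hasDigit && !hasUpper && !hasLower) return "numeric";
        if (hasDigit) return "alphanumeric";
        if (hasUpper && !hasLower) return "allcaps";
        if (!hasUpper) return "lowercase";
        // Mixed case: only tokens starting with a capital count as initcap.
        if (Character.isUpperCase(token.charAt(0))) return "initcap";
        return "other";
    }

    public static void main(String[] args) {
        String[] tokens = {"Aspirin", "binds", "BRCA1", "DNA", "2012", "x-ray"};
        for (String t : tokens) {
            System.out.println(t + " -> " + tokenClass(t));
        }
    }
}
```

The point is that the class describes the shape of the token and is used as an input feature. Whether a token ends up tagged as a gene or a drug is decided by the model, which can learn, for instance, that alphanumeric or all-caps tokens often occur inside gene names.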
Jörn