Hi all,
I am trying to properly understand all the built-in features of OpenNLP,
but I'm having trouble with some of them...
The maxent introduction page [1] mentions:
So, say you want to implement a program which uses maxent to find
names in a text, such as:
/He succeeds Terrence D. Daniels, formerly a W.R. Grace vice
chairman, who resigned./
If you are currently looking at the word /Terrence/ and are trying to
decide if it is a name or not, examples of the kinds of features you
might use are "previous=succeeds", "current=Terrence", "next=D.", and
"currentWordIsCapitalized". You might even add a feature that says
that "Terrence" was seen as a name before.
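Just to check that I am reading this correctly, here is a minimal sketch of my own of how I imagine those feature strings being built for one token. The class and method names (ContextFeatures, generate) are hypothetical and are not part of maxent or OpenNLP; only the feature strings themselves come from the page quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not an OpenNLP or maxent class.
class ContextFeatures {

    // Builds the context features for the token at position `index`.
    static List<String> generate(String[] tokens, int index) {
        List<String> features = new ArrayList<>();
        if (index > 0) {
            features.add("previous=" + tokens[index - 1]);
        }
        features.add("current=" + tokens[index]);
        if (index < tokens.length - 1) {
            features.add("next=" + tokens[index + 1]);
        }
        // Guard against empty tokens before looking at the first character.
        if (!tokens[index].isEmpty() && Character.isUpperCase(tokens[index].charAt(0))) {
            features.add("currentWordIsCapitalized");
        }
        return features;
    }

    public static void main(String[] args) {
        String[] tokens = {"He", "succeeds", "Terrence", "D.", "Daniels"};
        // Prints [previous=succeeds, current=Terrence, next=D., currentWordIsCapitalized]
        System.out.println(generate(tokens, 2));
    }
}
```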
I am particularly interested in the last sentence: *"You might even add
a feature that says that "Terrence" was seen as a name before."*
Does this refer to the "PreviousMapFeatureGenerator"?
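For reference, this is roughly what I picture a "seen as a name before" feature doing. It is only a sketch under my own assumptions, not OpenNLP's actual code: the class SeenBeforeFeature, its methods, and the "previousOutcome=" prefix are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; I do not know whether OpenNLP works this way.
class SeenBeforeFeature {

    // Remembers the most recent outcome assigned to each token.
    private final Map<String, String> lastOutcome = new HashMap<>();

    // Called after a token has been classified.
    void record(String token, String outcome) {
        lastOutcome.put(token, outcome);
    }

    // Emits one feature if the token was classified earlier, none otherwise.
    List<String> generate(String token) {
        List<String> features = new ArrayList<>();
        String previous = lastOutcome.get(token);
        if (previous != null) {
            features.add("previousOutcome=" + previous);
        }
        return features;
    }
}
```

So after record("Terrence", "name"), a later occurrence of "Terrence" would get the extra feature "previousOutcome=name". Is that the idea?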
Also, a while back I asked about the *OutcomePriorFeatureGenerator*,
and Jorn replied with this:
_it is there to measure the distribution of the outcome_
E.g. for the name it could be:
start 5%
cont 10%
other 85%
In a name-finding context, what does the above example mean? The outcome
is either TRUE or FALSE, yes? So the name finder either recognizes a name
or it doesn't. If Jorn had not shown this example, I would have understood
that this feature generator calculates the distribution over these 2 boolean
values. However, Jorn's example shows something different, which I don't
understand: what are 'start', 'cont' and 'other'? How are these outcomes, and
how do they help the name finder?
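The only part I think I follow is the arithmetic: I assume the percentages are simply the relative frequency of each outcome label in the training data, whatever the labels mean. A minimal sketch of that assumption follows; OutcomeDistribution is a hypothetical name of mine, and the label sequence is made up to reproduce Jorn's numbers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of a relative-frequency count; not OpenNLP code.
class OutcomeDistribution {

    // Returns, for each distinct label, its share of all labels seen.
    static Map<String, Double> of(String[] outcomes) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String outcome : outcomes) {
            counts.merge(outcome, 1, Integer::sum);
        }
        Map<String, Double> distribution = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            distribution.put(entry.getKey(), entry.getValue() / (double) outcomes.length);
        }
        return distribution;
    }
}
```

With 20 labels, of which 1 is "start", 2 are "cont" and 17 are "other", this gives 5%, 10% and 85%.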
Thanks in advance,
Jim
[1] http://maxent.sourceforge.net/howto.html