On 9/30/11 5:36 AM, James Kosin wrote:
1.5.1 and 1.5.2-rc2 Differences:
------------------------------------------
* A change somewhere fixes a bug in the sorting and merging of events.
I don't know where just yet, but the training output differs between
the two versions:
1.5.2:
Computing event counts... done. 203621 events
Indexing... done.
Sorting and merging events... done. Reduced 203621 events to 180416.
Done indexing.
Incorporating indexed data for training...
done.
Number of Event Tokens: 180416
Number of Outcomes: 3
Number of Predicates: 58811
...done.
1.5.1:
Computing event counts... done. 203621 events
Indexing... done.
Sorting and merging events... done. Reduced 203621 events to 180018.
Done indexing.
Incorporating indexed data for training...
done.
Number of Event Tokens: 180018
Number of Outcomes: 3
Number of Predicates: 58814
...done.
This is just one of many changes incorporated into 1.5.2....
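For readers unfamiliar with the "Sorting and merging events" line in the
logs above, here is a minimal sketch of what such a step does, assuming
two events count as duplicates when they share the same outcome and the
same set of context predicates. The class name, the string key, and the
sample events are hypothetical and are not taken from the OpenNLP source.
In the logs, 1.5.2 keeps 180416 distinct events where 1.5.1 kept 180018,
a difference of 398, so the two versions evidently disagree on which
events are identical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: not the OpenNLP data indexer.
public class EventMergeSketch {

    // Hypothetical merge key: the outcome plus the sorted context predicates,
    // so that predicate order does not affect equality.
    static String key(String outcome, String[] context) {
        String[] sorted = context.clone();
        Arrays.sort(sorted);
        return outcome + "|" + String.join(",", sorted);
    }

    // Each event is an array: element 0 is the outcome, the rest are predicates.
    // Identical events collapse into one entry whose value is the number seen.
    static Map<String, Integer> merge(List<String[]> events) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String[] e : events) {
            String[] context = Arrays.copyOfRange(e, 1, e.length);
            counts.merge(key(e[0], context), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> events = new ArrayList<>();
        events.add(new String[] {"start", "w=John", "prev=Mr."});
        // Same outcome and predicates as above, listed in a different order.
        events.add(new String[] {"start", "prev=Mr.", "w=John"});
        events.add(new String[] {"other", "w=the"});

        Map<String, Integer> merged = merge(events);
        System.out.println("Reduced " + events.size() + " events to " + merged.size() + ".");
    }
}
```

With the three sample events this prints "Reduced 3 events to 2.". Any
change to how predicates are generated or compared (for example a
different token class feature) would change the key and therefore the
merged count.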
With 1.5.2, some scores went up compared to 1.5.1 and some went down,
some slightly and some more dramatically.
The perceptron code was changed and updated. Jason fixed and improved
various things there.
I also added support for the training parameters file. None of
these changes should
affect the performance of components which use maxent. If they do, we
need to figure out why
and then possibly fix bugs.
The name finder was also updated to use the training params file, see
OPENNLP-195.
There are at least two issues which affect the performance of the name
finder. In the first, OPENNLP-172, we
replaced the token class code with new Unicode token class code, which
works more or less identically, but small differences are possible.
The second is the evaluator fix in RC 2, where the adaptive data is now
cleared.
Jörn