Re: RC - Performance test with public data

Jörn Kottmann Fri, 30 Sep 2011 07:02:22 -0700

On 9/30/11 5:36 AM, James Kosin wrote:

1.5.1 and 1.5.2-rc2 Differences:
------------------------------------------
* There is a change somewhere that fixes a bug with the sorting and
merging of events.  I don't know where just yet, but the output from the
training in
1.5.2:
     Computing event counts...  done. 203621 events
     Indexing...  done.
Sorting and merging events... done. Reduced 203621 events to 180416.
Done indexing.
Incorporating indexed data for training...
done.
     Number of Event Tokens: 180416
         Number of Outcomes: 3
       Number of Predicates: 58811
...done.


1.5.1:
     Computing event counts...  done. 203621 events
     Indexing...  done.
Sorting and merging events... done. Reduced 203621 events to 180018.
Done indexing.
Incorporating indexed data for training...
done.
     Number of Event Tokens: 180018
         Number of Outcomes: 3
       Number of Predicates: 58814
...done.

This is just one of many changes incorporated into 1.5.2....
With 1.5.2 some scores went up from 1.5.1 and some down.  Some a little
and some more dramatically.
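
For readers unfamiliar with the "Sorting and merging events" step in the logs above (203621 events reduced to 180416 in 1.5.2 and to 180018 in 1.5.1), here is a minimal sketch of the general idea: identical training events are collapsed into one entry with a count. This is not the OpenNLP implementation. The class name EventMergeSketch, the string key format, and the choice to treat predicate order as irrelevant are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; NOT the OpenNLP code.
class EventMergeSketch {

    // Builds a canonical key from an outcome and its context predicates.
    // Assumption: predicate order does not matter, so predicates are sorted.
    static String key(String outcome, String... predicates) {
        String[] sorted = predicates.clone();
        Arrays.sort(sorted);
        return outcome + "|" + String.join(",", sorted);
    }

    // Sorts the event keys, then collapses identical keys into a count.
    static Map<String, Integer> merge(List<String> events) {
        List<String> sorted = new ArrayList<>(events);
        Collections.sort(sorted);
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (String event : sorted) {
            merged.merge(event, 1, Integer::sum);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
            key("start", "w=John", "prev=Mr."),
            key("other", "w=said"),
            key("start", "prev=Mr.", "w=John")); // same event as the first
        Map<String, Integer> merged = merge(events);
        System.out.println("Reduced " + events.size()
            + " events to " + merged.size() + ".");
    }
}
```

Under a scheme like this, any change in how two events are judged equal changes the merged count, which is one way the two versions could report different totals from the same 203621 input events.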

The perceptron code was changed and updated. Jason fixed and improved
various things there. I additionally added support for the training
parameters file. None of these changes should affect the performance of
components which use maxent. If they do, we need to figure out why and
then maybe fix bugs.

The name finder was also updated to use the training params file, see
OPENNLP-195.

There are at least two issues which affect the performance of the name
finder. One is where we replaced the token class code with new Unicode
token class code, which works more or less identically, but small
differences are possible. That is OPENNLP-172.

The other is the evaluator fix in RC 2, where the adaptive data is now
cleared.

Jörn
