[
https://issues.apache.org/jira/browse/OPENNLP-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jörn Kottmann updated OPENNLP-199:
----------------------------------
Description:
- Changed the update to be the actual perceptron update: when a label
that is not the gold label is chosen for an event, the parameters
associated with that label are decremented, and the parameters
associated with the gold label are incremented. I checked this
empirically on several datasets, and it works better than the
previous update (and it involves fewer updates). A rough sketch of
this update appears after this list.
- stepsize is decreased to stepsize/1.05 on every iteration, ensuring
better stability toward the end of training. The jumpy updates of
later iterations were actually the main reason that the training set
accuracy obtained during parameter updates continued to differ from
the accuracy computed when parameters aren't updated. Now the
parameters don't jump around as much in later iterations, so things
settle down and those two accuracies converge if enough iterations
are allowed. (The decay appears in the first sketch after this list.)
- Training set accuracy is computed once per iteration.
- Training stops if the current training set accuracy differs by less
than a given tolerance from the accuracies obtained in each of the
previous three iterations (see the convergence-check sketch below).
- Averaging is done differently than before. Rather than doing an
immediate update, parameters are simply accumulated after iterations
(this makes the code much easier to understand/maintain). Also, not
every iteration is used, as that tends to give too much weight to the
final iterations, which don't actually differ that much from one
another. I tried a few things and found a simple method that works
well: sum the parameters from the first 20 iterations and then sum
the parameters from any further iterations that are perfect squares
(25, 36, 49, etc.). This gets a good (diverse) sample of parameters
for averaging, since the distance between subsequent parameter sets
gets larger as the number of iterations grows. (See the
averaging-schedule sketch after this list.)
- Added ListEventStream to make a stream out of a List<Event> (a
sketch of such an adapter appears below).
- Added some helper methods, e.g. maxIndex, to simplify the code in
the main algorithm.
- Training stats are no longer shown for every iteration; only the
first 10 iterations and every 10th iteration after that are reported.
- modelDistribution, params, evalParams and others are no longer class
variables; they have been pushed into the findParameters
method. Other variables could/should be made local as well, but they
are left as is for now.
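
To make the first two bullets concrete, here is a minimal,
self-contained sketch of the mistake-driven update and the stepsize
decay. It is not the actual PerceptronTrainer code: the method and
variable names (train, events, gold, numLabels, numFeatures) are
invented for illustration, binary features are assumed, and maxIndex
is given a plausible but not necessarily identical signature.

{code:java}
// Hypothetical sketch of the perceptron update described above.
// params[label][feature] holds one weight per (label, feature) pair.
public class PerceptronUpdateSketch {

  // Helper in the spirit of the maxIndex mentioned above: argmax over scores.
  static int maxIndex(double[] values) {
    int best = 0;
    for (int i = 1; i < values.length; i++)
      if (values[i] > values[best]) best = i;
    return best;
  }

  // events[e] = indices of the binary features active for event e;
  // gold[e]   = the gold label of event e.
  static double[][] train(int[][] events, int[] gold, int numLabels,
                          int numFeatures, int iterations) {
    double[][] params = new double[numLabels][numFeatures];
    double stepsize = 1.0;
    for (int iter = 0; iter < iterations; iter++) {
      for (int e = 0; e < events.length; e++) {
        double[] scores = new double[numLabels];
        for (int label = 0; label < numLabels; label++)
          for (int f : events[e])
            scores[label] += params[label][f];
        int predicted = maxIndex(scores);
        if (predicted != gold[e]) {            // update only on mistakes
          for (int f : events[e]) {
            params[gold[e]][f] += stepsize;    // increment the gold label
            params[predicted][f] -= stepsize;  // decrement the chosen label
          }
        }
      }
      stepsize /= 1.05;                        // decay after every iteration
    }
    return params;
  }
}
{code}

Only the chosen label and the gold label are touched on a mistake,
which is why this involves fewer updates than the previous scheme.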
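
The stopping rule can be pictured roughly as follows. Again this is a
hedged sketch rather than the real implementation; the helper name
converged and the accuracy array are invented for the example.

{code:java}
// Sketch of the convergence check: stop once the current training set
// accuracy is within the tolerance of each of the previous three.
class ConvergenceCheckSketch {

  static boolean converged(double[] accuracyByIteration, int current,
                           double tolerance) {
    if (current < 3)
      return false;               // need three previous iterations to compare
    for (int back = 1; back <= 3; back++)
      if (Math.abs(accuracyByIteration[current]
          - accuracyByIteration[current - back]) > tolerance)
        return false;             // accuracy is still moving too much
    return true;                  // stable for three iterations: stop training
  }
}
{code}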
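
The averaging schedule (first 20 iterations, then perfect squares
only) might look like the sketch below; the names useForAveraging,
accumulate and summedParams are hypothetical, and iterations are
assumed to be numbered from 1.

{code:java}
// Sketch of the parameter-averaging schedule described above.
class AveragingScheduleSketch {

  // Decide whether an iteration's parameters join the running sum.
  static boolean useForAveraging(int iteration) {  // iteration is 1-based
    if (iteration <= 20)
      return true;                                 // always keep the first 20
    int root = (int) Math.round(Math.sqrt(iteration));
    return root * root == iteration;               // then only 25, 36, 49, ...
  }

  // After a selected iteration, add the current parameters to the sums;
  // the averaged model divides each sum by the number of accumulations.
  static void accumulate(double[][] summedParams, double[][] params,
                         int iteration) {
    if (!useForAveraging(iteration))
      return;
    for (int label = 0; label < params.length; label++)
      for (int f = 0; f < params[label].length; f++)
        summedParams[label][f] += params[label][f];
  }
}
{code}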
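
ListEventStream itself is just a thin adapter over an in-memory list
of events. The sketch below captures the idea against a minimal
stand-in interface rather than the real opennlp.model API, so the
type names (SimpleEvent, SimpleEventStream, ListEventStreamSketch)
are all invented.

{code:java}
import java.util.Iterator;
import java.util.List;

// Stand-ins so the sketch is self-contained; not the real OpenNLP types.
class SimpleEvent { /* outcome plus context features would live here */ }

interface SimpleEventStream {
  boolean hasNext();
  SimpleEvent next();
}

// A ListEventStream-style adapter: expose an in-memory list of events
// through the stream interface a trainer consumes.
class ListEventStreamSketch implements SimpleEventStream {
  private final Iterator<SimpleEvent> it;

  ListEventStreamSketch(List<SimpleEvent> events) {
    this.it = events.iterator();
  }

  public boolean hasNext() { return it.hasNext(); }

  public SimpleEvent next() { return it.next(); }
}
{code}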
was:
- Changed the update to be the actual perceptron update: when a label
that is not the gold label is chosen for an event, the parameters
associated with that label are decremented, and the parameters
associated with the gold label are incremented. I checked this
empirically on several datasets, and it works better than the
previous update (and it involves fewer updates).
- stepsize is decreased to stepsize/1.05 on every iteration, ensuring
better stability toward the end of training. The jumpy updates of
later iterations were actually the main reason that the training set
accuracy obtained during parameter updates continued to differ from
the accuracy computed when parameters aren't updated. Now the
parameters don't jump around as much in later iterations, so things
settle down and those two accuracies converge if enough iterations
are allowed.
- Training set accuracy is computed once per iteration.
- Training stops if the current training set accuracy differs by less
than a given tolerance from the accuracies obtained in each of the
previous three iterations.
- Averaging is done differently than before. Rather than doing an
immediate update, parameters are simply accumulated after iterations
(this makes the code much easier to understand/maintain). Also, not
every iteration is used, as that tends to give too much weight to the
final iterations, which don't actually differ that much from one
another. I tried a few things and found a simple method that works
well: sum the parameters from the first 20 iterations and then sum
the parameters from any further iterations that are perfect squares
(25, 36, 49, etc.). This gets a good (diverse) sample of parameters
for averaging, since the distance between subsequent parameter sets
gets larger as the number of iterations grows.
- Added the prepositional phrase attachment dataset under
src/test/resources/data/ppa. This is done with permission from
Adwait Ratnaparkhi -- see the README for details.
- Created unit test to check perceptron training consistency, using
the prepositional phrase attachment data. It would be good to do the
same for maxent.
- Added ListEventStream to make a stream out of List<Event>
- Added some helper methods, e.g. maxIndex, to simplify the code in
the main algorithm.
- Training stats are no longer shown for every iteration; only the
first 10 iterations and every 10th iteration after that are reported.
- modelDistribution, params, evalParams and others are no longer class
variables; they have been pushed into the findParameters
method. Other variables could/should be made local as well, but they
are left as is for now.
> Refactor the PerceptronTrainer class to address a couple of problems
> --------------------------------------------------------------------
>
> Key: OPENNLP-199
> URL: https://issues.apache.org/jira/browse/OPENNLP-199
> Project: OpenNLP
> Issue Type: Improvement
> Components: Maxent
> Affects Versions: maxent-3.0.1-incubating
> Reporter: Jörn Kottmann
> Assignee: Jason Baldridge
> Fix For: tools-1.5.2-incubating, maxent-3.0.2-incubating
>
>
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira