Jason,
can you give me a short update on these changes? Should we go ahead
with the release, or is this something you really need in it?
If not, I would build RC 6 and start the vote.
Thanks,
Jörn
On 4/11/11 2:25 PM, Jörn Kottmann wrote:
On 4/11/11 2:11 PM, Jason Baldridge wrote:
As it turns out, I found some issues with the way the perceptron output
was normalized. It used a rather strange way of handling negative
numbers that didn't really work, so I changed it to exponentiation
followed by normalization.
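A minimal sketch of the exponentiate-then-normalize step described above
(illustrative only, not the actual OpenNLP code; the max subtraction for
numerical stability is an added assumption):

final class PerceptronNormalizationSketch {

    // Exponentiates the raw perceptron scores and rescales them so they
    // sum to 1. Negative scores are handled naturally, since exp() maps
    // them to small positive values; subtracting the maximum score first
    // keeps the exponentials from overflowing.
    static double[] normalize(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            if (s > max) {
                max = s;
            }
        }

        double sum = 0.0;
        double[] probs = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            probs[i] = Math.exp(scores[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }
}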
Can you please open a JIRA issue for this, and maybe give us a
reference to the code?
Also, the training accuracies reported during perceptron training
were much higher than the final training accuracy, which turned out to
be an artifact of the way the training examples were ordered. I changed
this so that after each iteration the training accuracy is scored
without changing the parameters. This gives a coherent value reported
on every iteration, and it also allows early stopping by checking
whether the same accuracy has been obtained some number of times
(e.g. 4) in a row. (This could also be done by checking that the
parameter values haven't changed, which would be better, but which
I'd only want to do after refactoring.)
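A rough sketch of the early-stopping check described above (hypothetical
names, not the actual trainer code): after each pass the training accuracy
is scored with the parameters held fixed, and training stops once the same
value has been seen 4 times in a row.

// Hypothetical stand-in for the trainer internals, for illustration only.
interface PerceptronPass {
    // Runs one pass of perceptron updates over the training data.
    void updateParameters();
    // Scores training accuracy without changing the parameters.
    double trainingAccuracy();
}

final class EarlyStoppingSketch {
    static void train(PerceptronPass pass, int maxIterations) {
        int stableCount = 0;
        double previousAccuracy = -1.0;
        for (int i = 0; i < maxIterations; i++) {
            pass.updateParameters();
            double accuracy = pass.trainingAccuracy();
            // Exact comparison mirrors the "same accuracy" check described above.
            if (accuracy == previousAccuracy) {
                stableCount++;
            } else {
                stableCount = 1;
                previousAccuracy = accuracy;
            }
            if (stableCount >= 4) {
                break; // same training accuracy 4 iterations in a row
            }
        }
    }
}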
Please also open a JIRA issue for this one.
I'm going to test the changes on a bunch of datasets this evening. If
anyone else is using the perceptrons much, it would be good if they
could do a before-and-after comparison.
We currently only use the perceptron for the POSTagger; we can re-run
the accuracy we get on some training/test sets.
Jörn