As it turns out, I found some issues with the way perceptron outputs were normalized. The old scheme handled negative scores in an ad hoc way that didn't really work, so I changed it to exponentiation followed by normalization.
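For anyone comparing before and after, the new scheme is roughly the following (just a sketch with a made-up method name, not the actual code in the model class):

  // Sketch only: exponentiate each raw perceptron score, then divide by the
  // sum so every score (including negative ones) maps into (0,1) and the
  // outcome probabilities sum to 1.
  static double[] normalize(double[] scores) {
    double sum = 0.0;
    double[] probs = new double[scores.length];
    for (int i = 0; i < scores.length; i++) {
      probs[i] = Math.exp(scores[i]);
      sum += probs[i];
    }
    for (int i = 0; i < probs.length; i++) {
      probs[i] /= sum;
    }
    return probs;
  }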
Also, the training accuracies reported during perceptron training were much higher than the final training accuracy, which turned out to be an artifact of the way training examples were ordered. I changed this so that after each iteration, the training accuracy is scored without changing the parameters. This gives a coherent value reported on every iteration, and it also allows early stopping by checking whether the same accuracy has been obtained some number of times (e.g. 4) in a row; there is a rough sketch of that check at the bottom of this message. (This could also be done by checking that the parameter values haven't changed, which would be better, but which I'd only want to do after refactoring.)

I'm going to test the changes on a bunch of datasets this evening. If anyone else is using the perceptrons much, it would be good if they could do a before-and-after comparison. There are no API changes due to any of these fixes.

Jason

On Mon, Apr 11, 2011 at 2:37 AM, Jörn Kottmann <[email protected]> wrote:
> On 4/10/11 8:39 PM, Jason Baldridge wrote:
>
>> Jorn,
>>
>> I'm putting together a homework assignment on classification with maxent
>> for my NLP class, and am making some fixes to ModelApplier in the process.
>> Would you like me to commit those, or wait until after the release is out?
>> Since it is an end application that nothing else depends on, this should
>> just constitute a bug fix.
>>
>
> Just to clarify, the bug fix is not modifying our API in any way?
> If so, please open a jira issue and commit the fix.
>
> I will wait with RC 6 until this change is in.
>
> Thanks,
> Jörn
>

--
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
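P.S. Here is the shape of the per-iteration scoring and stopping check mentioned above. The helper names (trainOneIteration, trainingAccuracy) are made up for illustration and don't correspond to actual methods in the trainer.

  // Sketch: run parameter-updating passes, score training accuracy after each
  // one without touching the parameters, and stop early once the same accuracy
  // has been seen some number of times (e.g. 4) in a row.
  void train(int maxIterations) {
    int repeats = 0;
    double prevAccuracy = -1.0;
    for (int iter = 0; iter < maxIterations; iter++) {
      trainOneIteration();                   // updates the parameters
      double accuracy = trainingAccuracy();  // read-only scoring pass
      if (accuracy == prevAccuracy) {
        repeats++;
      } else {
        repeats = 1;
        prevAccuracy = accuracy;
      }
      if (repeats >= 4) {
        break;                               // accuracy has stabilized
      }
    }
  }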
