2011/5/17 Jörn Kottmann <[email protected]>:
> Hi all,
>
> I was wondering if we can apply bug fixes which slightly decrease
> the performance of existing models?
>
> In this case I am speaking about OPENNLP-172, which fixes the handling
> of lower case sequences in the token class feature. It detects
> lower case sequences only when they contain just the letters A to Z,
> but other languages have more letters, like the German umlauts.
>
> This fix will decrease the recall of the existing Spanish person NER model
> by 2%.
> Should we apply it anyway for the next release?
>
> After retraining, the recall goes up by 6%.

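To make the issue concrete, here is a minimal sketch of the kind of difference being discussed. It is not the actual OpenNLP code or the OPENNLP-172 patch; the class name `LowercaseCheck` and both helper methods are hypothetical, and I am only assuming the old check behaves roughly like an ASCII-only pattern.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only, not taken from the OpenNLP sources.
class LowercaseCheck {

    // ASCII-only check: accepts tokens made up solely of the letters a to z.
    static final Pattern ASCII_LOWER = Pattern.compile("^[a-z]+$");

    // Unicode-aware check: \p{Ll} matches any lowercase letter,
    // including letters such as the German umlauts.
    static final Pattern UNICODE_LOWER = Pattern.compile("^\\p{Ll}+$");

    static boolean isAsciiLowercase(String token) {
        return ASCII_LOWER.matcher(token).matches();
    }

    static boolean isUnicodeLowercase(String token) {
        return UNICODE_LOWER.matcher(token).matches();
    }

    public static void main(String[] args) {
        // "f\u00fcr" is the German word "fuer" written with u-umlaut.
        String[] tokens = {"fur", "f\u00fcr", "F\u00fcr"};
        for (String token : tokens) {
            System.out.println(token
                    + " ascii=" + isAsciiLowercase(token)
                    + " unicode=" + isUnicodeLowercase(token));
        }
    }
}
```

A model trained with the first kind of check has learned its feature weights on the narrower definition, which would explain why an existing model can lose some recall once the check changes, until it is retrained.
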
I am +1 for fixing bugs and providing retrained models for the next release.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
