Hello,

+1 for 1.7.0 as the next release and +1 for a yearly release.

Just to provide some info, the main changes in the lemmatizer have been:

1. Added a supervised statistical lemmatizer, usable from the CLI and the API. The supervised lemmatizer now provides much better coverage of unknown words than the previously existing dictionary-based one.

2. The lemmatizer component has been rewritten, and the API has therefore changed substantially. As a result, the changes in the dictionary-based lemmatizer are not backward compatible. In any case, I do not think that many people were using it, and the change needed to use the new API is minor.

The new statistical lemmatizer can support the dictionary-based lemmatizers often used to provide features for components such as Word Sense Disambiguation, Opinion Mining/Sentiment Analysis, etc. In this regard, it would be nice to aim at developing those two components for release. Maybe the next release is too close, but definitely for the one after.

Cheers,

Rodrigo

On Mon, Nov 7, 2016 at 7:01 PM, Russ, Daniel (NIH/CIT) [E] <dr...@mail.nih.gov> wrote:

> Also the lemmatizer has significantly changed. I vote 1.7
>
> On 11/7/16, 12:59 PM, "Joern Kottmann" <kottm...@gmail.com> wrote:
>
> Hello all,
>
> since our last release it has been a while and we received quite a few
> changes which would be nice to get released.
>
> There are still some open Jira issues, but mostly smaller things that
> can be wrapped up rather quickly.
>
> Is there anything important missing which should go into the next
> release? Otherwise I think we should also aim for more frequent
> releases and just make one again early next year, with all the stuff we
> might miss out now.
>
> We took in a patch - as part of OPENNLP-830 - to replace our self-made
> hash table with the java.util.HashMap. This change is not backward
> compatible for folks who extend AbstractModel.
>
> Should we go with 1.6.1 as a next version or should we make 1.7.0 to
> reflect that?
>
> Previously we only had backward incompatible changes in versions which
> bumped the second number. Maybe that is a better choice. It will
> probably break some people's code when they update.
>
> We also have lots of deprecated API still in OpenNLP, should we try to
> remove as much as possible of it now?
>
> Jörn