Hi ladies and lords, elves, dwarves, trolls and everyone in between (or simply: Hi folks)!

I have posted a first release candidate for the Apache OpenNLP 2.3.2 release and it is ready for testing.

In this release we fixed several bugs and upgraded some dependencies. In addition, we added abbreviation dictionaries for several languages.

Moreover, we addressed a memory issue (OPENNLP-421) which occurs for large dictionaries due to String interning. Several new configuration options have been added to choose a String interning strategy. Details can be found in the related Jira / PR.

We switched the default ONNX runtime dependency in opennlp-dl to the CPU variant. If you need to use the GPU-accelerated version of ONNX, you can use the newly added module opennlp-dl-gpu. Moreover, we fixed the CLI on the Windows platform.

Thank you to everyone who contributed to this release, including all of our users and the people who submitted bug reports, contributed code or documentation enhancements.

The release was made using the OpenNLP release process, documented on the website:
https://opennlp.apache.org/release.html

Maven Repo:

<repositories>
  <repository>
    <id>opennlp-2.3.2-rc1</id>
    <name>Testing OpenNLP 2.3.2 release candidate</name>
    <url>https://repository.apache.org/content/repositories/orgapacheopennlp-1036</url>
  </repository>
</repositories>

Binaries & Source:
https://dist.apache.org/repos/dist/dev/opennlp/opennlp-2.3.2-rc1/

Tag:
https://github.com/apache/opennlp/releases/tag/opennlp-2.3.2

Release notes:

Bug
[OPENNLP-421] - Large dictionaries cause JVM OutOfMemoryError: PermGen due to String interning
[OPENNLP-1163] - Sentence detector doesn't spot abbreviations next to punctuation
[OPENNLP-1369] - NPE when serializing a TokenNameFinder model trained with POSTaggerNameFeatureGeneratorFactory
[OPENNLP-1520] - Generated Java code for stemmers is broken, and should be re-generated
[OPENNLP-1527] - OpenNLP CLI does not start on Windows
[OPENNLP-1529] - OpenNLP Docker targets Java 11

Improvement
[OPENNLP-1479] - Write better tests for pattern verification (tokenizers)
[OPENNLP-1519] - Modified 3 tests in opennlp-tools to handle any iteration order
[OPENNLP-1523] - Use the snowball-data set to write language-specific stemmer eval tests
[OPENNLP-1525] - Improve TokenizerME to make use of abbreviations provided in TokenizerModel
[OPENNLP-1526] - Add Spanish abbreviation dictionary
[OPENNLP-1530] - Add Italian abbreviation dictionary
[OPENNLP-1531] - Add Portuguese abbreviation dictionary
[OPENNLP-1540] - Add French abbreviation dictionary
[OPENNLP-1541] - Conduct cleanup in opennlp.tools.chunker package
[OPENNLP-1543] - Add Polish abbreviation dictionary
[OPENNLP-1544] - Update dependency jackson to version 2.16.1

Test
[OPENNLP-1446] - Investigate why LeskEvaluatorTest and MFSEvaluatorTest fail while parsing 'EnglishLS.train'
[OPENNLP-1532] - Conduct cleanup in existing test code

Task
[OPENNLP-1377] - Create link to ASF Slack #opennlp channel
[OPENNLP-1438] - Fix Release Documentation regarding KEYS
[OPENNLP-1516] - Provide a ONNX runtime GPU BOM for opennlp-dl
[OPENNLP-1542] - Investigate and fix failing PT chunker evaluation tests

Dependency upgrade
[OPENNLP-1533] - Update test dependency junit to version 5.10.1
[OPENNLP-1534] - Update Maven plugin forbiddenapis to version 3.6
[OPENNLP-1535] - Update dependency log4j2 to version 2.22.1
[OPENNLP-1536] - Update dependency onnxruntime to version 1.16.3
[OPENNLP-1537] - Update Apache Parent POM to version 31

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12311215&version=12353945

The results of the eval tests for the aforementioned tag will be available here (in a few hours):
https://ci-builds.apache.org/job/OpenNLP/job/eval-tests-releases/10/

Reminder: The up-2-date KEYS file for signature verification can be found here:
https://dist.apache.org/repos/dist/release/opennlp/KEYS

Please vote on releasing these packages as Apache OpenNLP 2.3.2. The vote is open for at least the next 72 hours.

Only votes from OpenNLP PMC are binding, but everyone is welcome to check the release candidate and vote. The vote passes if at least three binding +1 votes are cast.

Please VOTE

[+1] go ship it
[+0] meh, don't care
[-1] stop, there is a ${showstopper}

Thanks!
Richard