This is an automated email from the ASF dual-hosted git repository.
mawiesne pushed a change to branch OPENNLP-855-Sentiment-from-text
in repository https://gitbox.apache.org/repos/asf/opennlp.git
omit 3616cb71 OPENNLP-855: New SentimentAnalysisParser
add fb7eaede OPENNLP-1609: Update junit5-system-exit to 2.0.2 (#669)
add f41857ca OPENNLP-1617: Update JUnit to 5.11.3 (#671)
add e2ce958e OPENNLP-1629 Update DownloadUtil to support more languages
via new UD models - updates opennlp-models dependency to 1.1 - adapts
DownloadUtil to work with ud-models-1.1 release, adding 18 new supported
languages - adapts JUnit tests accordingly to include the new model files -
replaces content of 'index.html' copy of the released model list (v 1.1.0) for
DownloadParserTest
add 0eb89800 Bump com.puppycrawl.tools:checkstyle from 10.18.2 to 10.19.0
add 3c8bd725 OPENNLP-1598 - Update documentation on how to use
ClassPathModelFinder (#673)
add 74c7d52a OPENNLP-1631 Convert existing ModelLoader tests to
integration tests - converts the three existing test classes to "IT" ending so
that they get executed during failsafe plugin phase - adds 18 new languages for
each model type (sent, pos, tokens) - fixes missing language checks for "nl"
(Dutch) in DownloadParserTest
add 92c986b6 Bump jackson.version from 2.18.0 to 2.18.1
add b473d08e OPENNLP-1633 - Remove dependency towards jackson-databind in
opennlp-dl module
add 03e9fe2d OPENNLP-1634 - Move OpenNLP Brat Annotator back to Sandbox
add f112107c OPENNLP-1548 - Cleanup handling of output streams in cmdline
tools
add a5b1b62e OPENNLP-1226 Training an NER model for dates with
'dd.mm.yyyy' as Date format - adds NameFinderMEWithDatesTest verifying German
(and English) date formats, via custom training data per language, as
reproducer for OpenNLP-1226 - adds RandomGermanNewsGenerator to generate
synthetic news corpora with dates annotated, typical for DE locale - adds
RandomEnglishNewsGenerator to generate synthetic news corpora with dates
annotated, typical for EN,US/UK/AUS locale - extracts cod [...]
add e6abe1c4 Bump onnxruntime.version from 1.19.2 to 1.20.0
add cb8f5d68 Bump com.puppycrawl.tools:checkstyle from 10.19.0 to 10.20.0
add cfb6c57c OPENNLP-1643 - Remove inconsistent Training Parameter
Definitions (#682)
add 637fd35d Removes outdated KEYS file from repository, as per ASF
requirements the KEYS file should be published in a project's distribution
directory - see:
https://infra.apache.org/release-distribution.html#sigs-and-sums
add 647cb9a4 OpenNLP 2.5.0 (#685)
add 3feb4209 Adjust Dependabot to only consider major version updates for
some dependencies (#687)
add 4d9dab9c OPENNLP-1646: Remove compiler warnings in AbstractDL and
inheriting classes (#688)
add 662936ea OPENNLP-1647: Update classgraph to 4.8.179 (#689)
add cff36bc3 OPENNLP-1644: Add missing opennlp-tools-models to
opennlp-distr (#686)
add 0cfb24e3 OPENNLP-1652 Add additional constructors for
ThreadSafePOSTaggerME (#690)
add ec09b7e4 OPENNLP-1653 Add thread-safe version of LemmatizerME (#691)
add 7b385368 OPENNLP-1654 Add thread-safe version of NameFinderME - adds
ThreadSafeNameFinderME - adds additional constructor to ThreadSafeTokenizerME &
ThreadSafeSentenceDetectorME to be consistent with ThreadSafePOSTaggerME -
improves existing JavaDoc along the path
add a238e18e OPENNLP-1650 Update DownloadUtil to use Models release 1.2 -
adapts DownloadUtil, related classes and tests towards Models 1.2 - updates
index.html in opennlp/tools/util to latest data Models 1.2 for
DownloadParserTest - introduces DownloadUtil.ModelType#LEMMATIZER as those are
now available - adds LemmatizerModelLoaderIT - extracts some copy-pasted strings
to constants - fixes broken JavaDoc in PerceptronTrainer along the path
add 374bee93 OPENNLP-1655: Add constructors in SentenceDetectorME and
TokenizerME to inject custom abbreviation dictionary (#694)
add 0475d9c6 OPENNLP-1657: Update log4j2 to 2.24.2 (#695)
add fcf7e3d7 OPENNLP-1658: Enhance JavaDoc of code in opennlp-tools-models
(#696)
add 41d9717d Avoid running CI twice. (#698)
add f10d9a7a OPENNLP-1649: Set OpenNLP Models Maven artifacts 1.2.0 in
opennlp-tools-models (#699)
add c86e1e7d OPENNLP-1659 - Enhancements for DownloadUtil
add 3378dea6 Ignore minor and patch versions of puppycrawl. In case we get
a major update, notify.
add b156a905 Set notification settings for OpenNLP (PR, Issues, etc)
Enable autolinking to JIRA if commit starts with OPENNLP-XXXX
add 3cf8b914 adjusts README.md in terms of social media channel badges,
removing Twitter/X, adding Bluesky and Slack (#703)
add 1d722007 OPENNLP-1660: Switch to pre-trained UD models in Dev Manual
(#702)
add 8b84a3e7 OPENNLP-1663 Add test for FileToByteArraySampleStream (#706)
add f4de6c23 OPENNLP-1662: Wrap thread-safe classes in try-with resources
in Eval test (#705)
add e91ceb17 OPENNLP-1661: Fix custom models being wiped from OpenNLP
user.home directory (#704)
add e6ef2b50 [maven-release-plugin] prepare release opennlp-2.5.1
add 1562f749 [maven-release-plugin] prepare for next development iteration
add 5310565f OPENNLP-1667: Add thread-safe version of ChunkerME (#708)
add a3a5e0fb OPENNLP-1668: Avoid multiple DecimalFormat instances in
AbstractModel (#710)
add 7d4b450f OPENNLP-1669: Improve JavaDoc of QN related classes (#709)
add 9ba57ce6 OPENNLP-1670: Disable releases for apache.snapshots repo
add fb3b3f7d OPENNLP-1671: Convert while loops with duplicated code to
do-while loops (#712)
add 3fd4fd1c OPENNLP-1673: Re-use static conversion methods in ArrayMath
(#714)
add 34ca5182 OPENNLP-1674: Make use of enhanced switch expression
introduced in Java 14 (#715)
add b2d954a3 OPENNLP-1672: Flip misordered assertEquals arguments in
several tests (#713)
add 2f2f631c OPENNLP-1675: Address ShellCheck warnings for shell scripts
(#716)
add 49678c37 OPENNLP-1677: Extend JavaDoc of POSTaggerME (#717)
add 5b846a30 OPENNLP-1679: Extend JavaDoc of SgmlParser (#719)
add 74486145 OPENNLP-1678: Add thread-safe version of LanguageDetectorME
(#718)
add 6b7d87b1 OPENNLP-1681: Update log4j2 to 2.24.3 (#723)
add 23dd6cee OPENNLP-1682: Update JUnit to 5.11.4 (#726)
add 3fd914f9 OPENNLP-1683: Update Uimaj to 3.6.0 (#725)
add ed2682cc OPENNLP-1447: Reenable Cmdline Tool execution tests (#720)
add 818a333f OPENNLP-1680: Update several Maven plugins to recent versions
(#729)
add 1a50db3c OpenNLP 2.5.2 (#730)
add b9f07123 OPENNLP-1684: Reduce creation of String instances in
BrownBigramFeatureGenerator (#731)
add b690315b OPENNLP-1685: Adapt bin.xml assembly descriptor to include
generated JavaDoc (#732)
add dc577498 OPENNLP-855: New SentimentAnalysisParser
This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:
 * -- * -- B -- O -- O -- O   (3616cb71)
            \
             N -- N -- N   refs/heads/OPENNLP-855-Sentiment-from-text (dc577498)
You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.
Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.
No new revisions were added by this update.
Summary of changes:
.asf.yaml | 8 +
.github/dependabot.yml | 5 +
.github/workflows/maven.yml | 11 +-
KEYS | 674 --
NOTICE | 6 -
README.md | 9 +-
opennlp-brat-annotator/pom.xml | 124 -
.../src/main/bin/brat-annotation-service | 56 -
.../src/main/bin/brat-annotation-service.bat | 51 -
.../java/opennlp/bratann/NameFinderAnnService.java | 102 -
.../java/opennlp/bratann/NameFinderResource.java | 138 -
opennlp-distr/pom.xml | 316 +-
opennlp-distr/src/main/assembly/bin.xml | 276 +-
opennlp-distr/src/main/bin/opennlp | 20 +-
opennlp-dl-gpu/pom.xml | 2 +-
opennlp-dl/pom.xml | 14 +-
.../src/main/java/opennlp/dl/AbstractDL.java | 6 +-
.../dl/doccat/DocumentCategorizerConfig.java | 35 +-
.../opennlp/dl/doccat/DocumentCategorizerDL.java | 43 +-
.../dl/doccat/DocumentCategorizerConfigTest.java | 196 +
.../opennlp/dl/namefinder/NameFinderDLEval.java | 6 +-
opennlp-docs/pom.xml | 2 +-
opennlp-docs/src/docbkx/corpora.xml | 21 -
opennlp-docs/src/docbkx/langdetect.xml | 4 +-
opennlp-docs/src/docbkx/lemmatizer.xml | 116 +-
opennlp-docs/src/docbkx/model-loading.xml | 147 +
opennlp-docs/src/docbkx/opennlp.xml | 3 +-
opennlp-docs/src/docbkx/postagger.xml | 21 +-
opennlp-docs/src/docbkx/sentdetect.xml | 17 +-
opennlp-docs/src/docbkx/tokenizer.xml | 22 +-
opennlp-morfologik-addon/pom.xml | 2 +-
.../src/main/bin/morfologik-addon | 20 +-
opennlp-tools-models/pom.xml | 2 +-
.../tools/models/AbstractClassPathModelFinder.java | 15 +-
.../java/opennlp/tools/models/ClassPathModel.java | 16 +
.../opennlp/tools/models/ClassPathModelEntry.java | 7 +
.../opennlp/tools/models/ClassPathModelFinder.java | 5 +-
.../opennlp/tools/models/ClassPathModelLoader.java | 4 +-
.../models/classgraph/ClassgraphModelFinder.java | 17 +-
.../models/simple/SimpleClassPathModelFinder.java | 49 +-
opennlp-tools/pom.xml | 36 +-
.../opennlp/tools/chunker/ThreadSafeChunkerME.java | 91 +
.../cmdline/doccat/DoccatCrossValidatorTool.java | 2 +-
.../tools/cmdline/doccat/DoccatEvaluatorTool.java | 2 +-
.../LanguageDetectorCrossValidatorTool.java | 2 +-
.../langdetect/LanguageDetectorEvaluatorTool.java | 2 +-
.../lemmatizer/LemmatizerEvaluatorTool.java | 2 +-
.../postag/POSTaggerCrossValidatorTool.java | 2 +-
.../cmdline/postag/POSTaggerEvaluatorTool.java | 2 +-
.../cmdline/tokenizer/TokenizerTrainerTool.java | 12 +-
.../tools/formats/Conll02NameSampleStream.java | 23 +-
.../tools/formats/EvalitaNameSampleStream.java | 23 +-
.../opennlp/tools/formats/ad/ADSentenceStream.java | 7 +-
.../conllu/ConlluLemmaSampleStreamFactory.java | 17 +-
.../conllu/ConlluPOSSampleStreamFactory.java | 17 +-
.../tools/formats/conllu/ConlluWordLine.java | 13 +-
.../tools/formats/muc/DocumentSplitterStream.java | 2 +-
.../java/opennlp/tools/formats/muc/SgmlParser.java | 92 +-
.../langdetect/ThreadSafeLanguageDetectorME.java | 84 +
.../ThreadSafeLemmatizerME.java} | 56 +-
.../ml/AbstractEventModelSequenceTrainer.java | 3 +-
.../opennlp/tools/ml/AbstractEventTrainer.java | 6 +-
.../java/opennlp/tools/ml/AbstractTrainer.java | 31 +-
.../main/java/opennlp/tools/ml/TrainerFactory.java | 25 +-
.../java/opennlp/tools/ml/maxent/GISTrainer.java | 4 +-
.../tools/ml/maxent/quasinewton/Function.java | 21 +
.../tools/ml/maxent/quasinewton/LineSearch.java | 140 +-
.../ml/maxent/quasinewton/NegLogLikelihood.java | 4 +-
.../quasinewton/ParallelNegLogLikelihood.java | 9 +-
.../tools/ml/maxent/quasinewton/QNMinimizer.java | 72 +-
.../tools/ml/maxent/quasinewton/QNModel.java | 11 +-
.../tools/ml/maxent/quasinewton/QNTrainer.java | 38 +-
.../tools/ml/model/AbstractDataIndexer.java | 4 -
.../java/opennlp/tools/ml/model/AbstractModel.java | 23 +-
.../opennlp/tools/ml/model/DataIndexerFactory.java | 24 +-
.../opennlp/tools/ml/model/OnePassDataIndexer.java | 4 +-
.../opennlp/tools/ml/model/TwoPassDataIndexer.java | 4 +-
.../tools/ml/perceptron/PerceptronTrainer.java | 8 +-
.../SimplePerceptronSequenceTrainer.java | 5 +-
.../ThreadSafeNameFinderME.java} | 53 +-
.../java/opennlp/tools/postag/POSTaggerME.java | 74 +-
.../tools/postag/ThreadSafePOSTaggerME.java | 59 +-
.../sentdetect/DefaultSDContextGenerator.java | 12 +-
.../tools/sentdetect/SentenceDetectorME.java | 26 +-
.../sentdetect/ThreadSafeSentenceDetectorME.java | 51 +-
.../java/opennlp/tools/stemmer/PorterStemmer.java | 17 +-
.../tools/tokenize/ThreadSafeTokenizerME.java | 48 +-
.../java/opennlp/tools/tokenize/TokenizerME.java | 23 +-
.../tools/tokenize/lang/en/TokenSampleStream.java | 21 +-
.../main/java/opennlp/tools/util/DownloadUtil.java | 167 +-
.../opennlp/tools/util/TrainingParameters.java | 24 +-
.../featuregen/BrownBigramFeatureGenerator.java | 18 +-
.../opennlp/tools/AbstractModelLoaderTest.java | 11 +-
.../opennlp/tools/chunker/ChunkSampleTest.java | 2 +-
.../tools/chunker/ChunkerEvaluatorTest.java | 4 +-
.../tools/cmdline/TokenNameFinderToolTest.java | 105 +-
.../LemmatizerModelLoaderIT.java} | 27 +-
.../namefind/generator/AbstractNewsGenerator.java | 47 +
.../generator/RandomEnglishNewsGenerator.java | 272 +
.../generator/RandomGermanNewsGenerator.java | 275 +
...SModelLoaderTest.java => POSModelLoaderIT.java} | 17 +-
...lLoaderTest.java => SentenceModelLoaderIT.java} | 17 +-
...LoaderTest.java => TokenizerModelLoaderIT.java} | 17 +-
.../tokenizer/TokenizerTrainerToolTest.java | 109 +-
.../tools/doccat/DocumentCategorizerNBTest.java | 5 +-
.../opennlp/tools/doccat/DocumentSampleTest.java | 2 +-
.../opennlp/tools/eval/MultiThreadedToolsEval.java | 54 +-
.../tools/formats/brat/BratDocumentTest.java | 20 +-
.../convert/AbstractConvertTest.java} | 32 +-
.../FileToByteArraySampleStreamTest.java} | 21 +-
.../FileToStringSampleStreamTest.java} | 24 +-
.../masc/MascNamedEntitySampleStreamTest.java | 6 +-
.../formats/masc/MascPOSSampleStreamTest.java | 6 +-
.../formats/masc/MascSentenceSampleStreamTest.java | 6 +-
.../formats/masc/MascTokenSampleStreamTest.java | 6 +-
.../opennlp/tools/formats/muc/SgmlParserTest.java | 17 +-
.../langdetect/LanguageDetectorEvaluatorTest.java | 2 +-
.../tools/langdetect/LanguageSampleTest.java | 2 +-
.../opennlp/tools/langdetect/LanguageTest.java | 2 +-
.../opennlp/tools/lemmatizer/LemmaSampleTest.java | 2 +-
.../java/opennlp/tools/ml/TrainerFactoryTest.java | 4 +-
.../opennlp/tools/ml/maxent/GISIndexingTest.java | 23 +-
.../tools/ml/maxent/MaxentPrepAttachTest.java | 11 +-
.../tools/ml/maxent/RealValueModelTest.java | 3 +-
.../tools/ml/maxent/ScaleDoesntMatterTest.java | 3 +-
.../ml/maxent/io/RealValueFileEventStreamTest.java | 3 +-
.../maxent/quasinewton/NegLogLikelihoodTest.java | 3 +-
.../ml/maxent/quasinewton/QNMinimizerTest.java | 7 +-
.../ml/maxent/quasinewton/QNPrepAttachTest.java | 19 +-
.../tools/ml/maxent/quasinewton/QNTrainerTest.java | 3 +-
.../ml/model/OnePassRealValueDataIndexerTest.java | 6 +-
.../ml/naivebayes/NaiveBayesCorrectnessTest.java | 3 +-
.../naivebayes/NaiveBayesModelReadWriteTest.java | 3 +-
.../ml/naivebayes/NaiveBayesPrepAttachTest.java | 11 +-
.../NaiveBayesSerializedCorrectnessTest.java | 3 +-
.../ml/perceptron/PerceptronPrepAttachTest.java | 33 +-
.../tools/namefind/AbstractNameFinderTest.java | 96 +
.../opennlp/tools/namefind/NameFinderMETest.java | 161 +-
.../tools/namefind/NameFinderMEWithDatesTest.java | 237 +
.../opennlp/tools/namefind/NameSampleTest.java | 2 +-
.../tools/namefind/TokenNameFinderModelTest.java | 4 +-
.../opennlp/tools/parser/ParserEvaluatorTest.java | 6 +-
.../java/opennlp/tools/postag/POSSampleTest.java | 10 +-
.../java/opennlp/tools/postag/POSTaggerMETest.java | 83 +-
.../sentdetect/SentenceDetectorEvaluatorTest.java | 2 +-
.../tools/sentdetect/SentenceDetectorMEIT.java | 52 +-
.../tools/sentdetect/SentenceDetectorMETest.java | 60 +-
.../tools/sentdetect/SentenceSampleTest.java | 2 +-
.../opennlp/tools/stemmer/SnowballStemmerTest.java | 112 +-
.../opennlp/tools/tokenize/TokenSampleTest.java | 2 +-
.../opennlp/tools/util/DownloadParserTest.java | 126 +-
.../tools/util/DownloadUtilDownloadTwiceTest.java | 97 +
.../java/opennlp/tools/util/DownloadUtilTest.java | 136 +-
.../opennlp/tools/util/TrainingParametersTest.java | 8 +-
.../util/featuregen/GeneratorFactoryTest.java | 4 +-
opennlp-tools/src/test/resources/logback-test.xml | 40 +
.../namefind/RandomNewsWithGeneratedDates_DE.train | 10000 +++++++++++++++++++
.../namefind/RandomNewsWithGeneratedDates_EN.train | 10000 +++++++++++++++++++
.../tools/namefind/origin-training-data.txt | 13 +
.../test/resources/opennlp/tools/util/index.html | 603 +-
opennlp-uima/pom.xml | 4 +-
pom.xml | 49 +-
162 files changed, 24258 insertions(+), 3003 deletions(-)
delete mode 100644 KEYS
delete mode 100644 opennlp-brat-annotator/pom.xml
delete mode 100755 opennlp-brat-annotator/src/main/bin/brat-annotation-service
delete mode 100755
opennlp-brat-annotator/src/main/bin/brat-annotation-service.bat
delete mode 100644
opennlp-brat-annotator/src/main/java/opennlp/bratann/NameFinderAnnService.java
delete mode 100644
opennlp-brat-annotator/src/main/java/opennlp/bratann/NameFinderResource.java
create mode 100644
opennlp-dl/src/test/java/opennlp/dl/doccat/DocumentCategorizerConfigTest.java
create mode 100644 opennlp-docs/src/docbkx/model-loading.xml
create mode 100644
opennlp-tools/src/main/java/opennlp/tools/chunker/ThreadSafeChunkerME.java
create mode 100644
opennlp-tools/src/main/java/opennlp/tools/langdetect/ThreadSafeLanguageDetectorME.java
copy
opennlp-tools/src/main/java/opennlp/tools/{tokenize/ThreadSafeTokenizerME.java
=> lemmatizer/ThreadSafeLemmatizerME.java} (52%)
copy
opennlp-tools/src/main/java/opennlp/tools/{sentdetect/ThreadSafeSentenceDetectorME.java
=> namefind/ThreadSafeNameFinderME.java} (55%)
copy
opennlp-tools/src/test/java/opennlp/tools/cmdline/{tokenizer/TokenizerModelLoaderTest.java
=> lemmatizer/LemmatizerModelLoaderIT.java} (62%)
create mode 100644
opennlp-tools/src/test/java/opennlp/tools/cmdline/namefind/generator/AbstractNewsGenerator.java
create mode 100644
opennlp-tools/src/test/java/opennlp/tools/cmdline/namefind/generator/RandomEnglishNewsGenerator.java
create mode 100644
opennlp-tools/src/test/java/opennlp/tools/cmdline/namefind/generator/RandomGermanNewsGenerator.java
rename
opennlp-tools/src/test/java/opennlp/tools/cmdline/postag/{POSModelLoaderTest.java
=> POSModelLoaderIT.java} (72%)
rename
opennlp-tools/src/test/java/opennlp/tools/cmdline/sentdetect/{SentenceModelLoaderTest.java
=> SentenceModelLoaderIT.java} (71%)
rename
opennlp-tools/src/test/java/opennlp/tools/cmdline/tokenizer/{TokenizerModelLoaderTest.java
=> TokenizerModelLoaderIT.java} (72%)
rename
opennlp-tools/src/test/java/opennlp/tools/{convert/FileToStringSampleStreamTest.java
=> formats/convert/AbstractConvertTest.java} (64%)
copy
opennlp-tools/src/test/java/opennlp/tools/formats/{muc/SgmlParserTest.java =>
convert/FileToByteArraySampleStreamTest.java} (63%)
copy
opennlp-tools/src/test/java/opennlp/tools/formats/{muc/SgmlParserTest.java =>
convert/FileToStringSampleStreamTest.java} (64%)
create mode 100644
opennlp-tools/src/test/java/opennlp/tools/namefind/AbstractNameFinderTest.java
create mode 100644
opennlp-tools/src/test/java/opennlp/tools/namefind/NameFinderMEWithDatesTest.java
create mode 100644
opennlp-tools/src/test/java/opennlp/tools/util/DownloadUtilDownloadTwiceTest.java
create mode 100644 opennlp-tools/src/test/resources/logback-test.xml
create mode 100644
opennlp-tools/src/test/resources/opennlp/tools/namefind/RandomNewsWithGeneratedDates_DE.train
create mode 100644
opennlp-tools/src/test/resources/opennlp/tools/namefind/RandomNewsWithGeneratedDates_EN.train
create mode 100644
opennlp-tools/src/test/resources/opennlp/tools/namefind/origin-training-data.txt