[
https://issues.apache.org/jira/browse/OPENNLP-793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775359#comment-17775359
]
ASF GitHub Bot commented on OPENNLP-793:
----------------------------------------
kinow commented on code in PR #554:
URL: https://github.com/apache/opennlp/pull/554#discussion_r1359845269
##########
opennlp-tools/src/main/java/opennlp/tools/sentdetect/SentenceDetectorFactory.java:
##########
@@ -112,20 +112,36 @@ public Map<String, String> createManifestEntries() {
return manifestEntries;
}
- public static SentenceDetectorFactory create(String subclassName,
- String languageCode, boolean useTokenEnd,
- Dictionary abbreviationDictionary, char[] eosCharacters)
- throws InvalidFormatException {
+ /**
+ * Instantiates a {@link SentenceDetectorFactory} via a given {@code subclassName}.
+ *
+ * @param subclassName The class name used for instantiation. If {@code null}, an
+ * instance of {@link SentenceDetectorFactory} will be returned
+ * per default. Otherwise, the {@link ExtensionLoader} mechanism
+ * is applied to load the requested {@code subclassName}.
+ * @param languageCode The ISO language code to be used. Must not be {@code null}.
+ * @param useTokenEnd {@code true} if {@link #TOKEN_END_PROPERTY} shall be set,
+ * {@code false} otherwise.
+ * @param abbrDictionary The {@link Dictionary} of abbriviations if available;
Review Comment:
s/abbriviations/abbreviations
##########
opennlp-tools/src/main/java/opennlp/tools/sentdetect/SentenceDetectorFactory.java:
##########
@@ -184,6 +206,9 @@ public EndOfSentenceScanner getEndOfSentenceScanner() {
}
}
+ /**
+ * @return A {@link SDContextGenerator} instance, guaranteed to be not {@code null}.
+ */
public SDContextGenerator getSDContextGenerator() {
Review Comment:
I wonder if at some point it'd be useful to annotate methods like this with
`@NotNull` and use some validator/static checker. Thanks for the docs though!
:+1:
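To illustrate the idea, here is a minimal, self-contained sketch of what such an annotation could look like. Everything in it is hypothetical: the `NotNull` annotation is declared locally as a stand-in for one from an annotation library, and `Factory`, `ContextGenerator` and `DefaultContextGenerator` are simplified placeholders, not the actual OpenNLP classes. A real static checker would verify the contract at compile time; the reflection helper below only shows that the contract becomes machine-readable.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class NotNullSketch {

  /** Hypothetical marker annotation; a real project would take one from a library. */
  @Documented
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface NotNull {}

  /** Placeholder for a context generator type (not the OpenNLP interface). */
  interface ContextGenerator {}

  /** Placeholder default implementation used when nothing custom is configured. */
  static class DefaultContextGenerator implements ContextGenerator {}

  /** Simplified factory whose getter documents its non-null guarantee via annotation. */
  static class Factory {
    private final ContextGenerator custom;

    Factory(ContextGenerator custom) {
      this.custom = custom; // may be null, in which case the default is used
    }

    @NotNull
    public ContextGenerator getContextGenerator() {
      // Falls back to a default instance, so callers never receive null.
      return custom != null ? custom : new DefaultContextGenerator();
    }
  }

  /** Returns true if the named no-arg method on the given type carries @NotNull. */
  static boolean isDeclaredNotNull(Class<?> type, String methodName) {
    try {
      Method m = type.getDeclaredMethod(methodName);
      return m.isAnnotationPresent(NotNull.class);
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    Factory factory = new Factory(null);
    System.out.println(factory.getContextGenerator() != null);
    System.out.println(isDeclaredNotNull(Factory.class, "getContextGenerator"));
  }
}
```

With an annotation like this in place, the Javadoc sentence "guaranteed to be not null" would no longer be the only statement of the contract, and a checker of the project's choosing could flag any code path that returns `null`.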
> Change default behavior when the sentence model has an abbreviation dictionary
> ------------------------------------------------------------------------------
>
> Key: OPENNLP-793
> URL: https://issues.apache.org/jira/browse/OPENNLP-793
> Project: OpenNLP
> Issue Type: Improvement
> Components: Sentence Detector
> Reporter: Gustavo Knuppe
> Assignee: Martin Wiesner
> Priority: Major
> Labels: enhancement, patch
> Fix For: 2.3.1
>
> Attachments: SentenceDetectorME.patch
>
>
> When a sentence model has an abbreviation dictionary, the default behavior of
> the SentenceDetectorME should deal with the abbreviations (even if the model
> is poorly trained).