[
https://issues.apache.org/jira/browse/OPENNLP-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591454#comment-16591454
]
ASF GitHub Bot commented on OPENNLP-1213:
-----------------------------------------
kojisekig closed pull request #328: OPENNLP-1213: Use ja for Japanese language
code rather than jp
URL: https://github.com/apache/opennlp/pull/328
This PR was merged from a forked repository.
Because GitHub hides the original diff of a foreign pull request once it
is merged, the diff is reproduced below for the sake of provenance:
diff --git a/opennlp-tools/src/main/java/opennlp/tools/sentdetect/lang/Factory.java b/opennlp-tools/src/main/java/opennlp/tools/sentdetect/lang/Factory.java
index 4a34229d7..b54c1590e 100644
--- a/opennlp-tools/src/main/java/opennlp/tools/sentdetect/lang/Factory.java
+++ b/opennlp-tools/src/main/java/opennlp/tools/sentdetect/lang/Factory.java
@@ -35,7 +35,7 @@
public static final char[] thEosCharacters = new char[] { ' ','\n' };
- public static final char[] jpEosCharacters = new char[] {'。', '!', '?'};
+ public static final char[] jpnEosCharacters = new char[] {'。', '!', '?'};
public EndOfSentenceScanner createEndOfSentenceScanner(String languageCode) {
@@ -72,8 +72,8 @@ public SDContextGenerator createSentenceContextGenerator(String languageCode) {
return thEosCharacters;
} else if ("pt".equals(languageCode) || "por".equals(languageCode)) {
return ptEosCharacters;
- } else if ("jp".equals(languageCode) || "jpn".equals(languageCode)) {
- return jpEosCharacters;
+ } else if ("jpn".equals(languageCode)) {
+ return jpnEosCharacters;
}
return defaultEosCharacters;
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> Use ja for Japanese language code rather than jp
> ------------------------------------------------
>
> Key: OPENNLP-1213
> URL: https://issues.apache.org/jira/browse/OPENNLP-1213
> Project: OpenNLP
> Issue Type: Bug
> Affects Versions: 1.9.0
> Reporter: Koji Sekiguchi
> Priority: Minor
> Fix For: 1.9.1
>
>
> It seems that the sentdetect Factory uses "jp" as the Japanese language code,
> but I think that is a country code. Let's use "ja" instead.
> We could keep "jp" for backward compatibility, but I don't think we need to,
> so I'll just replace "jp" with "ja" in the patch.
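
To illustrate the distinction the description draws, here is a minimal Java sketch of a lookup from language code to end-of-sentence characters. It is not the OpenNLP Factory class: the class name EosLookupSketch, the method eosCharactersFor, the map-based lookup, and the contents of DEFAULT_EOS are all assumptions made for illustration. The sketch accepts both "ja" (ISO 639-1) and "jpn" (ISO 639-3), and lets "jp", which is the ISO 3166-1 country code for Japan, fall through to the default.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; this is not the OpenNLP Factory class.
final class EosLookupSketch {

    // Assumption: the default set is the usual Western sentence terminators.
    static final char[] DEFAULT_EOS = {'.', '!', '?'};

    // U+3002 is the ideographic full stop that appears in the diff.
    static final char[] JPN_EOS = {'\u3002', '!', '?'};

    private static final Map<String, char[]> EOS_BY_LANGUAGE = new HashMap<>();

    static {
        EOS_BY_LANGUAGE.put("jpn", JPN_EOS); // ISO 639-3 language code
        EOS_BY_LANGUAGE.put("ja", JPN_EOS);  // ISO 639-1 language code (assumed alias)
        // "jp" is deliberately not registered: it is a country code.
    }

    // Returns a copy of the end-of-sentence characters for the given code,
    // or a copy of the defaults when the code is null or unknown.
    static char[] eosCharactersFor(String languageCode) {
        char[] eos = EOS_BY_LANGUAGE.getOrDefault(languageCode, DEFAULT_EOS);
        return eos.clone();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(eosCharactersFor("jpn")));
        System.out.println(Arrays.toString(eosCharactersFor("jp")));
    }
}
```

If backward compatibility with "jp" were wanted after all, one more entry in the static block would restore it without touching the lookup method.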
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)