danmuzi commented on a change in pull request #1296: LUCENE-9253: Support custom dictionaries in KoreanTokenizer
URL: https://github.com/apache/lucene-solr/pull/1296#discussion_r386122678
 
 

 ##########
 File path: lucene/analysis/nori/src/java/org/apache/lucene/analysis/ko/KoreanTokenizer.java
 ##########
 @@ -185,16 +185,43 @@ public KoreanTokenizer(AttributeFactory factory, UserDictionary userDictionary,
    * @param discardPunctuation true if punctuation tokens should be dropped from the output.
    */
   public KoreanTokenizer(AttributeFactory factory, UserDictionary userDictionary, DecompoundMode mode, boolean outputUnknownUnigrams, boolean discardPunctuation) {
+    this(factory,
+        TokenInfoDictionary.getInstance(),
+        UnknownDictionary.getInstance(),
+        ConnectionCosts.getInstance(),
+        userDictionary, mode, outputUnknownUnigrams, discardPunctuation);
+  }
+
+  /**
+   * <p>Create a new KoreanTokenizer supplying a custom system dictionary and unknown dictionary.
+   * This constructor provides an entry point for users that want to construct custom language models
+   * that can be used as input to {@link org.apache.lucene.analysis.ko.util.DictionaryBuilder}.</p>
+   *
+   * @param factory the AttributeFactory to use
+   * @param systemDictionary a custom known token dictionary
+   * @param unkDictionary a custom unknown token dictionary
+   * @param connectionCosts custom token transition costs
+   * @param userDictionary Optional: if non-null, user dictionary.
+   * @param mode Decompound mode.
+   * @param outputUnknownUnigrams If true outputs unigrams for unknown words.
 
 Review comment:
   Oh, I did that because it was capitalized before.
   I'll change the other constructors as well.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

