[jira] [Created] (OPENNLP-1736) NGramLanguageModel - Allow choice of smoothing/discounting algorithm

Nishant Shrivastava (Jira) Sun, 11 May 2025 13:02:07 -0700

Nishant Shrivastava created OPENNLP-1736:
--------------------------------------------


             Summary: NGramLanguageModel - Allow choice of smoothing/discounting algorithm
                 Key: OPENNLP-1736
                 URL: https://issues.apache.org/jira/browse/OPENNLP-1736
             Project: OpenNLP
          Issue Type: Wish
          Components: language model
    Affects Versions: 2.5.4
            Reporter: Nishant Shrivastava


Currently, NGramLanguageModel uses stupid backoff to handle zero-probability 
n-grams (see https://issues.apache.org/jira/browse/OPENNLP-986).

It would be useful to refactor it so that the smoothing/discounting logic can 
be passed in from outside.
This would allow us to add implementations of other smoothing/discounting 
techniques (e.g. those below) in the future.

https://en.wikipedia.org/wiki/Kneser%E2%80%93Ney_smoothing
https://en.wikipedia.org/wiki/Good%E2%80%93Turing_frequency_estimation
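
A minimal sketch of what such a refactoring could look like, assuming a
strategy-style interface. All names here (SmoothingStrategy, NGramCounts,
StupidBackoff, AddOneSmoothing, PluggableNGramModel) are hypothetical and are
not part of the current OpenNLP API. The count store is a simplified stand-in
for the model's internal n-gram counts, and add-one smoothing is included only
as a second, simple implementation to show the swap.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical extension point: turns raw n-gram counts into a score. */
interface SmoothingStrategy {
  double probability(List<String> ngram, NGramCounts counts);
}

/** Simplified stand-in for the model's internal n-gram counts. */
final class NGramCounts {
  private final Map<List<String>, Integer> counts = new HashMap<>();
  private int totalUnigrams = 0;

  /** Counts every n-gram of order 1..maxOrder in the token sequence. */
  void add(List<String> tokens, int maxOrder) {
    for (int i = 0; i < tokens.size(); i++) {
      for (int n = 1; n <= maxOrder && i + n <= tokens.size(); n++) {
        counts.merge(List.copyOf(tokens.subList(i, i + n)), 1, Integer::sum);
      }
      totalUnigrams++;
    }
  }

  int count(List<String> ngram) {
    return counts.getOrDefault(ngram, 0);
  }

  int totalUnigrams() {
    return totalUnigrams;
  }
}

/** Stupid backoff: relative frequency, scaled by alpha per backoff step.
 *  The result is a score, not a normalized probability. */
final class StupidBackoff implements SmoothingStrategy {
  private final double alpha;

  StupidBackoff(double alpha) {
    this.alpha = alpha;
  }

  @Override
  public double probability(List<String> ngram, NGramCounts counts) {
    if (ngram.size() == 1) {
      int total = counts.totalUnigrams();
      return total == 0 ? 0.0 : (double) counts.count(ngram) / total;
    }
    int full = counts.count(ngram);
    int context = counts.count(ngram.subList(0, ngram.size() - 1));
    if (full > 0 && context > 0) {
      return (double) full / context;
    }
    // Unseen n-gram: drop the oldest token and back off to the shorter n-gram.
    return alpha * probability(ngram.subList(1, ngram.size()), counts);
  }
}

/** Add-one (Laplace) smoothing; the vocabulary size is supplied by the caller. */
final class AddOneSmoothing implements SmoothingStrategy {
  private final int vocabularySize;

  AddOneSmoothing(int vocabularySize) {
    this.vocabularySize = vocabularySize;
  }

  @Override
  public double probability(List<String> ngram, NGramCounts counts) {
    int context = ngram.size() == 1
        ? counts.totalUnigrams()
        : counts.count(ngram.subList(0, ngram.size() - 1));
    return (double) (counts.count(ngram) + 1) / (context + vocabularySize);
  }
}

/** Hypothetical model that receives its smoothing logic from outside. */
final class PluggableNGramModel {
  private final NGramCounts counts = new NGramCounts();
  private final int order;
  private final SmoothingStrategy smoothing;

  PluggableNGramModel(int order, SmoothingStrategy smoothing) {
    this.order = order;
    this.smoothing = smoothing;
  }

  void add(List<String> tokens) {
    counts.add(tokens, order);
  }

  double probability(List<String> ngram) {
    return smoothing.probability(ngram, counts);
  }
}
```

With this shape, a caller would pick the algorithm at construction time, e.g.
new PluggableNGramModel(2, new StupidBackoff(0.4)), and a Kneser-Ney or
Good-Turing implementation could be added later as another SmoothingStrategy
without touching the model class.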



