[ https://issues.apache.org/jira/browse/OPENNLP-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nishant Shrivastava updated OPENNLP-1736:
-----------------------------------------
    Issue Type: Improvement  (was: Wish)

> NGramLanguageModel - Allow choice of smoothing/discounting algorithm
> --------------------------------------------------------------------
>
>                 Key: OPENNLP-1736
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1736
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: language model
>    Affects Versions: 2.5.4
>            Reporter: Nishant Shrivastava
>            Priority: Major
>
> Currently, NGramLanguageModel uses stupid backoff to deal with "zero probability n-grams": https://issues.apache.org/jira/browse/OPENNLP-986
> It would be useful to refactor it so that the smoothing/discounting logic can be supplied from outside.
> This would allow other smoothing/discounting techniques to be added in the future, for example:
> https://en.wikipedia.org/wiki/Kneser%E2%80%93Ney_smoothing
> https://en.wikipedia.org/wiki/Good%E2%80%93Turing_frequency_estimation


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
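For illustration, here is a minimal Java sketch of what a pluggable strategy could look like. The SmoothingStrategy interface, the StupidBackoff class, and the constructor shown in the usage comment are hypothetical names used only for this sketch; they are not part of OpenNLP's current API.

{code:java}
// Hypothetical sketch only: none of these names exist in OpenNLP today.
// It illustrates how NGramLanguageModel could delegate probability estimation
// to a pluggable smoothing/discounting strategy instead of hard-coding stupid backoff.

/** Strategy for estimating the probability of an n-gram (hypothetical interface). */
interface SmoothingStrategy {

  /**
   * @param ngramCount   count of the full n-gram, 0 if unseen
   * @param historyCount count of the (n-1)-gram history
   * @param backoffProb  probability estimate of the shorter, backed-off n-gram
   * @return smoothed probability of the n-gram
   */
  double probability(long ngramCount, long historyCount, double backoffProb);
}

/** The current stupid-backoff behaviour, expressed as one such strategy. */
class StupidBackoff implements SmoothingStrategy {

  private final double alpha;

  StupidBackoff(double alpha) {
    this.alpha = alpha; // typically around 0.4
  }

  @Override
  public double probability(long ngramCount, long historyCount, double backoffProb) {
    if (ngramCount > 0 && historyCount > 0) {
      // relative frequency for observed n-grams
      return (double) ngramCount / historyCount;
    }
    // discounted back-off for unseen n-grams
    return alpha * backoffProb;
  }
}

// Hypothetical usage: the model would take the strategy at construction time,
// so Kneser-Ney or Good-Turing could be dropped in later as other implementations.
// NGramLanguageModel lm = new NGramLanguageModel(3, new StupidBackoff(0.4));
{code}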