[ 
https://issues.apache.org/jira/browse/OPENNLP-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Shrivastava updated OPENNLP-1736:
-----------------------------------------
    Description: 
Currently, NGramLanguageModel uses stupid backoff to deal with “zero 
probability n-grams” (see https://issues.apache.org/jira/browse/OPENNLP-986).

It would be useful to refactor it so that the smoothing/discounting logic can 
be passed in from outside.
This would allow us to add and use implementations of other 
smoothing/discounting techniques (e.g. those below) in the future.

[https://en.wikipedia.org/wiki/Kneser%E2%80%93Ney_smoothing]
[https://en.wikipedia.org/wiki/Good%E2%80%93Turing_frequency_estimation]

  was:
Currently, NGramLanguageModel uses stupid backoff to deal with “zero 
probability n-grams”. https://issues.apache.org/jira/browse/OPENNLP-986

It will be useful, if we can refactor it to pass a smoothing/discounting logic 
from outside.
This will allow us to add implementations of other smoothing/discounting 
techniques (e.g. below) in future.

https://en.wikipedia.org/wiki/Kneser%E2%80%93Ney_smoothing
https://en.wikipedia.org/wiki/Good%E2%80%93Turing_frequency_estimation


> NGramLanguageModel - Allow choice of smoothing/discounting algorithm
> --------------------------------------------------------------------
>
>                 Key: OPENNLP-1736
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1736
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: language model
>    Affects Versions: 2.5.4
>            Reporter: Nishant Shrivastava
>            Priority: Minor
>
> Currently, NGramLanguageModel uses stupid backoff to deal with “zero 
> probability n-grams” (see https://issues.apache.org/jira/browse/OPENNLP-986).
> It would be useful to refactor it so that the smoothing/discounting logic 
> can be passed in from outside.
> This would allow us to add and use implementations of other 
> smoothing/discounting techniques (e.g. those below) in the future.
> [https://en.wikipedia.org/wiki/Kneser%E2%80%93Ney_smoothing]
> [https://en.wikipedia.org/wiki/Good%E2%80%93Turing_frequency_estimation]
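
To illustrate the refactoring requested above, here is a minimal, self-contained sketch of what a pluggable smoothing strategy could look like. It does not use the OpenNLP API: the names Smoothing, StupidBackoff and TinyNGramModel are hypothetical and chosen only for illustration, and the backoff factor is a constructor argument rather than whatever NGramLanguageModel uses today. Other techniques such as Kneser-Ney or Good-Turing would be further implementations of the same interface.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SmoothingSketch {

  /** Hypothetical strategy interface; NOT part of the OpenNLP API. */
  interface Smoothing {
    double probability(List<String> ngram,
                       Map<List<String>, Integer> counts,
                       int totalUnigrams);
  }

  /**
   * Stupid backoff: use the relative frequency if the n-gram was seen,
   * otherwise multiply the score of the shortened n-gram by alpha.
   * The result is a score, not a normalized probability.
   */
  static final class StupidBackoff implements Smoothing {
    private final double alpha;

    StupidBackoff(double alpha) {
      this.alpha = alpha;
    }

    @Override
    public double probability(List<String> ngram,
                              Map<List<String>, Integer> counts,
                              int totalUnigrams) {
      int count = counts.getOrDefault(ngram, 0);
      if (ngram.size() == 1) {
        // Base case: unigram relative frequency (0 for unseen tokens).
        return totalUnigrams == 0 ? 0.0 : (double) count / totalUnigrams;
      }
      List<String> history = ngram.subList(0, ngram.size() - 1);
      int historyCount = counts.getOrDefault(history, 0);
      if (count > 0 && historyCount > 0) {
        return (double) count / historyCount;
      }
      // Back off: drop the first token and discount by alpha.
      return alpha * probability(ngram.subList(1, ngram.size()), counts, totalUnigrams);
    }
  }

  /** Hypothetical toy model that receives its smoothing logic from outside. */
  static final class TinyNGramModel {
    private final Map<List<String>, Integer> counts = new HashMap<>();
    private final Smoothing smoothing;
    private final int order;
    private int totalUnigrams;

    TinyNGramModel(int order, Smoothing smoothing) {
      this.order = order;
      this.smoothing = smoothing;
    }

    /** Counts every n-gram of length 1..order in the token sequence. */
    void add(String... tokens) {
      for (int n = 1; n <= order; n++) {
        for (int i = 0; i + n <= tokens.length; i++) {
          List<String> gram = Arrays.asList(Arrays.copyOfRange(tokens, i, i + n));
          counts.merge(gram, 1, Integer::sum);
        }
      }
      totalUnigrams += tokens.length;
    }

    /** Delegates scoring to the injected strategy. */
    double probability(String... ngram) {
      return smoothing.probability(Arrays.asList(ngram), counts, totalUnigrams);
    }
  }
}
```

With this shape, switching techniques would only mean passing a different Smoothing instance to the constructor, e.g. new TinyNGramModel(2, new StupidBackoff(0.4)); the model code itself stays unchanged.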



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
