[ https://issues.apache.org/jira/browse/OPENNLP-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nishant Shrivastava updated OPENNLP-1736:
-----------------------------------------
    Issue Type: Improvement  (was: Wish)

> NGramLanguageModel - Allow choice of smoothing/discounting algorithm
> --------------------------------------------------------------------
>
>                 Key: OPENNLP-1736
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1736
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: language model
>    Affects Versions: 2.5.4
>            Reporter: Nishant Shrivastava
>            Priority: Major
>
> Currently, NGramLanguageModel uses stupid backoff to deal with "zero probability n-grams": https://issues.apache.org/jira/browse/OPENNLP-986
> It would be useful to refactor it so that the smoothing/discounting logic can be supplied from outside.
> This would allow other smoothing/discounting techniques to be added in the future, for example:
> https://en.wikipedia.org/wiki/Kneser%E2%80%93Ney_smoothing
> https://en.wikipedia.org/wiki/Good%E2%80%93Turing_frequency_estimation


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
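For illustration, here is a minimal Java sketch of what a pluggable strategy could look like. The SmoothingStrategy interface, the StupidBackoff class, and the constructor shown in the usage comment are hypothetical names used only for this sketch; they are not part of OpenNLP's current API.

{code:java}
// Hypothetical sketch only: none of these names exist in OpenNLP today.
// It illustrates how NGramLanguageModel could delegate probability estimation
// to a pluggable smoothing/discounting strategy instead of hard-coding stupid backoff.

/** Strategy for estimating the probability of an n-gram (hypothetical interface). */
interface SmoothingStrategy {

  /**
   * @param ngramCount   count of the full n-gram, 0 if unseen
   * @param historyCount count of the (n-1)-gram history
   * @param backoffProb  probability estimate of the shorter, backed-off n-gram
   * @return smoothed probability of the n-gram
   */
  double probability(long ngramCount, long historyCount, double backoffProb);
}

/** The current stupid-backoff behaviour, expressed as one such strategy. */
class StupidBackoff implements SmoothingStrategy {

  private final double alpha;

  StupidBackoff(double alpha) {
    this.alpha = alpha; // typically around 0.4
  }

  @Override
  public double probability(long ngramCount, long historyCount, double backoffProb) {
    if (ngramCount > 0 && historyCount > 0) {
      // relative frequency for observed n-grams
      return (double) ngramCount / historyCount;
    }
    // discounted back-off for unseen n-grams
    return alpha * backoffProb;
  }
}

// Hypothetical usage: the model would take the strategy at construction time,
// so Kneser-Ney or Good-Turing could be dropped in later as other implementations.
// NGramLanguageModel lm = new NGramLanguageModel(3, new StupidBackoff(0.4));
{code}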