[ https://issues.apache.org/jira/browse/OPENNLP-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi resolved OPENNLP-1267.
------------------------------------
    Fix Version/s: 1.9.3
       Resolution: Fixed

> Allow the LanguageDetector to stop before processing the full string
> --------------------------------------------------------------------
>
>                 Key: OPENNLP-1267
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1267
>             Project: OpenNLP
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Major
>             Fix For: 1.9.3
>
>
> On TIKA-2790, I found that Yalder is stopping after computing character
> ngrams on roughly the first 60 characters. That _likely_ explains its
> impressive speed. Let's make this "stopping short" feature available in
> OpenNLP.
>
> Ideally, the language detector wouldn't copy the full String, it wouldn't
> normalize the full String, and it wouldn't compute ngrams on the full String.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
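
For illustration only, here is a minimal sketch of the "stopping short" idea described in the issue: character ngrams are counted over a bounded prefix of the input rather than the whole String. It is plain Java and is not the OpenNLP implementation or the change made for this issue. The class name PrefixNgrams, the method countNgrams, and the parameter maxChars are hypothetical names chosen for this sketch, and the limit of 60 in main simply echoes the "roughly the first 60 characters" observed for Yalder.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of OpenNLP: counts character n-grams
// over a bounded prefix of the input instead of the full text.
final class PrefixNgrams {

    /**
     * Counts character n-grams of length n, looking at no more than the
     * first maxChars chars of text. The full input is never copied; only
     * the small n-gram substrings inside the prefix are materialized.
     */
    static Map<String, Integer> countNgrams(CharSequence text, int n, int maxChars) {
        if (n <= 0 || maxChars < 0) {
            throw new IllegalArgumentException("n must be > 0 and maxChars >= 0");
        }
        // Stop short: never read past the first maxChars chars.
        int limit = Math.min(text.length(), maxChars);
        Map<String, Integer> counts = new HashMap<>();
        for (int start = 0; start + n <= limit; start++) {
            String gram = text.subSequence(start, start + n).toString();
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Build a long input; only its first 60 chars are examined below.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("the quick brown fox ");
        }
        Map<String, Integer> counts = countNgrams(sb, 3, 60);
        System.out.println(counts.size() + " distinct trigrams in the first 60 chars");
    }
}
```

Taking a CharSequence and bounding the loop by maxChars means the cost depends on the prefix length rather than the input length. Any normalization step would likewise be applied to the prefix only, in line with the second paragraph of the issue description.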