[jira] [Commented] (OPENNLP-1267) Allow the LanguageDetector to stop before processing the full string

Tim Allison (JIRA) Fri, 07 Jun 2019 06:52:06 -0700


    [ https://issues.apache.org/jira/browse/OPENNLP-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858658#comment-16858658 ]


Tim Allison commented on OPENNLP-1267:
--------------------------------------

Do you all want to create a separate LanguageDetector or should we bake this 
feature into the current LanguageDetectorME?

Would there be any objections to adding something like {{charsRead}} or 
{{ngramsComputed}} to the {{Language}} class, or otherwise making the amount 
of input actually processed available to users of the detector?
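
To make the question concrete, here is a rough, self-contained sketch of what
"stopping short" plus a {{charsRead}} value could look like. Everything in it
is hypothetical: {{EarlyStopSketch}}, {{Result}}, {{detect}}, the chunk size,
the threshold and the toy scorer are illustration only and are not part of the
OpenNLP API or of how Yalder works.

```java
import java.util.function.ToDoubleFunction;

public class EarlyStopSketch {

    /** Hypothetical result type: best confidence plus how much input was consumed. */
    static final class Result {
        final double confidence;
        final int charsRead;

        Result(double confidence, int charsRead) {
            this.confidence = confidence;
            this.charsRead = charsRead;
        }
    }

    /**
     * Scores growing prefixes of the text and stops as soon as the confidence
     * reaches the threshold. The scorer stands in for a real model. A real
     * implementation would probably update ngram counts incrementally rather
     * than re-score the whole prefix on every step.
     */
    static Result detect(CharSequence text, int chunkSize, double threshold,
                         ToDoubleFunction<CharSequence> scorer) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int end = 0;
        double confidence = 0.0;
        while (end < text.length()) {
            end = Math.min(text.length(), end + chunkSize);
            confidence = scorer.applyAsDouble(text.subSequence(0, end));
            if (confidence >= threshold) {
                break; // confident enough, skip the rest of the string
            }
        }
        return new Result(confidence, end);
    }

    public static void main(String[] args) {
        // Toy scorer (assumption): confidence grows with prefix length.
        ToDoubleFunction<CharSequence> toy = s -> Math.min(1.0, s.length() / 60.0);
        Result r = detect("x".repeat(500), 20, 0.9, toy);
        System.out.println(r.charsRead + " chars read, confidence " + r.confidence);
    }
}
```

Whether something like {{charsRead}} lives on {{Language}} or on a separate
result object is exactly the open design question; the sketch only shows that
the detector has the number on hand when it decides to stop.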

> Allow the LanguageDetector to stop before processing the full string
> --------------------------------------------------------------------
>
>                 Key: OPENNLP-1267
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1267
>             Project: OpenNLP
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Major
>
> On TIKA-2790, I found that Yalder is stopping after computing character 
> ngrams on roughly the first 60 characters.  That _likely_ explains its 
> impressive speed.  Let's make this "stopping short" feature available in 
> OpenNLP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
