[ 
https://issues.apache.org/jira/browse/SOLR-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403190#comment-16403190
 ] 

Ryan Pedela commented on SOLR-8014:
-----------------------------------

Would OpenNLP be a better option? They recently released a [trained 
model|https://opennlp.apache.org/models.html] that supports 103 languages.

> Replace langdetect lib by more updated fork
> -------------------------------------------
>
>                 Key: SOLR-8014
>                 URL: https://issues.apache.org/jira/browse/SOLR-8014
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - LangId
>            Reporter: Jan Høydahl
>            Priority: Major
>
> The language-detection library we use is 
> https://code.google.com/p/language-detection/ version 1.1 from 2012. The 
> project has stalled with no new development, not even in the [GitHub 
> repo](https://github.com/shuyo/language-detection) the original author put up.
> The most promising fork looks to be 
> https://github.com/optimaize/language-detector/, which the Tika project is 
> also selecting to replace its old detector.



