[ 
https://issues.apache.org/jira/browse/SOLR-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733757#comment-14733757
 ] 

Dawid Weiss commented on SOLR-8014:
-----------------------------------

https://github.com/carrotsearch/langid-java

There's also this one we ported from Python (a good while ago). It'd be 
interesting to see how they all compare in terms of model quality... :)

> Replace langdetect lib by more updated fork
> -------------------------------------------
>
>                 Key: SOLR-8014
>                 URL: https://issues.apache.org/jira/browse/SOLR-8014
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - LangId
>            Reporter: Jan Høydahl
>
> The language-detection library we use is 
> https://code.google.com/p/language-detection/ version 1.1 from 2012. The 
> project has stalled with no new development, not even in the [github 
> repo](https://github.com/shuyo/language-detection) the original author put up.
> The most promising fork appears to be
> https://github.com/optimaize/language-detector/ which the Tika project is
> also selecting to replace its old detector.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

