[
https://issues.apache.org/jira/browse/SOLR-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733757#comment-14733757
]
Dawid Weiss commented on SOLR-8014:
-----------------------------------
https://github.com/carrotsearch/langid-java
There's also this one we ported from Python (a good while ago). It'd be
interesting to see how they all compare in terms of model quality... :)
> Replace langdetect lib by more updated fork
> -------------------------------------------
>
> Key: SOLR-8014
> URL: https://issues.apache.org/jira/browse/SOLR-8014
> Project: Solr
> Issue Type: Improvement
> Components: contrib - LangId
> Reporter: Jan Høydahl
>
> The language-detection library we use is
> https://code.google.com/p/language-detection/ version 1.1 from 2012. The
> project has stalled with no new development, not even in the [GitHub
> repo](https://github.com/shuyo/language-detection) the original author put up.
> The most promising fork looks to be
> https://github.com/optimaize/language-detector/, which the Tika project is
> also adopting to replace its old detector.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)