[
https://issues.apache.org/jira/browse/SOLR-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403190#comment-16403190
]
Ryan Pedela commented on SOLR-8014:
-----------------------------------
Would OpenNLP be a better option? They recently released a [trained
model|https://opennlp.apache.org/models.html] that supports 103 languages.
> Replace langdetect lib by more updated fork
> -------------------------------------------
>
> Key: SOLR-8014
> URL: https://issues.apache.org/jira/browse/SOLR-8014
> Project: Solr
> Issue Type: Improvement
> Components: contrib - LangId
> Reporter: Jan Høydahl
> Priority: Major
>
> The language-detection library we use is
> https://code.google.com/p/language-detection/ version 1.1 from 2012. The
> project has stalled with no new development, not even in the [GitHub
> repo](https://github.com/shuyo/language-detection) the original author put up.
> It looks like the most promising fork is
> https://github.com/optimaize/language-detector/ which the Tika project has
> also chosen to replace its old detector.