[ https://issues.apache.org/jira/browse/SOLR-11592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328244#comment-16328244 ]
Steve Rowe commented on SOLR-11592:
-----------------------------------

A note about model licensing: I intentionally did not include OpenNLP's pre-trained model in the patch, because the Leipzig corpora[1] were used to train the model. The Leipzig corpora's license is CC BY-NC 4.0[2], which is on Apache's Category B list[3]; I think this means the Solr project could redistribute the OpenNLP pre-trained model, but I am uncertain.

For testing, a model is produced from a small subset of the same source data. I don't think we need to include licensing info for this test model derived from Leipzig corpora data, but I'm open to other perspectives.

[1] Leipzig corpora: http://wortschatz.uni-leipzig.de/en/download/
[2] Leipzig corpora Terms of Usage: http://wortschatz.uni-leipzig.de/en/usage
[3] Apache "Category B" 3rd party licenses: https://www.apache.org/legal/resolved.html#category-b

> add another language detector using OpenNLP
> -------------------------------------------
>
> Key: SOLR-11592
> URL: https://issues.apache.org/jira/browse/SOLR-11592
> Project: Solr
> Issue Type: New Feature
> Security Level: Public (Default Security Level. Issues are Public)
> Components: contrib - LangId
> Affects Versions: 7.1
> Reporter: Koji Sekiguchi
> Priority: Minor
> Attachments: SOLR-11592.patch, SOLR-11592.patch
>
>
> We already have two language detectors, lang-detect and Tika's lang detect.
> This is a ticket that gives users a third option using OpenNLP. :)

--
This message was sent by Atlassian JIRA (v7.6.3#76005)