[ https://issues.apache.org/jira/browse/SOLR-11592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242989#comment-16242989 ]
Steve Rowe commented on SOLR-11592:
-----------------------------------

Hi Koji,

Looks good so far! In addition to testing, documentation is also needed ({{detecting-languages-during-indexing.adoc}}).

For IntelliJ with this patch, the langid.iml needs to add a dependency on the {{analysis-common}} module:

{noformat}
+    <orderEntry type="module" module-name="analysis-common" />
{noformat}

About your TODO:

{code:java}
// TODO: not sure *100 is appropriate...
languages.add(new DetectedLanguage(language.getLang(), language.getConfidence() * 100));
{code}

{{DetectedLanguage.getCertainty()}} javadoc says:

{code:java}
/**
 * Returns the detected certainty for this language
 * @return certainty as a value between 0.0 and 1.0 where 1.0 is 100% certain
 */
{code}

So I think {{*100}} is inappropriate.

> add another language detector using OpenNLP
> -------------------------------------------
>
>                 Key: SOLR-11592
>                 URL: https://issues.apache.org/jira/browse/SOLR-11592
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: contrib - LangId
>    Affects Versions: 7.1
>            Reporter: Koji Sekiguchi
>            Priority: Minor
>         Attachments: SOLR-11592.patch
>
>
> We already have two language detectors, lang-detect and Tika's lang detect.
> This is a ticket that gives users third option using OpenNLP. :)
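
To make the {{*100}} point from the comment above concrete, here is a minimal, self-contained sketch. It is not the patch's code: {{OpenNlpLanguageStub}}, {{DetectedLanguageStub}} and {{CertaintySketch}} are hypothetical stand-ins so the example runs without Solr or OpenNLP on the classpath. It assumes the detector's confidence is already on a 0.0 to 1.0 scale, and the range check in the stub constructor is an assumption added only to make the scale mismatch visible; the real {{DetectedLanguage}} may not validate its input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a detector result: a language code plus a confidence.
// Assumption: confidence is already a value between 0.0 and 1.0.
final class OpenNlpLanguageStub {
  private final String lang;
  private final double confidence;

  OpenNlpLanguageStub(String lang, double confidence) {
    this.lang = lang;
    this.confidence = confidence;
  }

  String getLang() { return lang; }
  double getConfidence() { return confidence; }
}

// Hypothetical stand-in for DetectedLanguage. The range check is an assumption
// that enforces the javadoc contract (certainty between 0.0 and 1.0).
final class DetectedLanguageStub {
  private final String langCode;
  private final double certainty;

  DetectedLanguageStub(String langCode, double certainty) {
    if (certainty < 0.0 || certainty > 1.0) {
      throw new IllegalArgumentException("certainty must be in [0.0, 1.0], got " + certainty);
    }
    this.langCode = langCode;
    this.certainty = certainty;
  }

  String getLangCode() { return langCode; }
  double getCertainty() { return certainty; }
}

final class CertaintySketch {
  // Converts detector results without rescaling the confidence.
  static List<DetectedLanguageStub> convert(List<OpenNlpLanguageStub> predictions) {
    List<DetectedLanguageStub> languages = new ArrayList<>();
    for (OpenNlpLanguageStub language : predictions) {
      // No "* 100" here: the confidence is passed through unchanged.
      languages.add(new DetectedLanguageStub(language.getLang(), language.getConfidence()));
    }
    return languages;
  }
}
```

With the stub's range check in place, passing {{language.getConfidence() * 100}} for any confidence above 0.01 would be rejected, which is the mismatch with the {{getCertainty()}} javadoc described above.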