[ https://issues.apache.org/jira/browse/NUTCH-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney resolved NUTCH-960.
----------------------------------------

    Resolution: Won't Fix

This is way too old and, as Ken pointed out, this should be dealt with upstream in Tika.

> Language ID - confidence factor
> -------------------------------
>
>                 Key: NUTCH-960
>                 URL: https://issues.apache.org/jira/browse/NUTCH-960
>             Project: Nutch
>          Issue Type: Wish
>    Affects Versions: 1.2
>            Reporter: M Alexander
>
> Hi
> In JAVA implementation, what is the best way to calculate the confidence of
> the outcome of the language id for a given text?
> For example:
> n-gram matching / total n-gram * 100.
> when a text is passed. The outcome would be "en" with 89% confidence. What is
> the best way to implement this to the existing nutch language id code?
> Thanks
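
For illustration only, the ratio proposed in the report (matching n-grams / total n-grams * 100) could be sketched in Java roughly as follows. This is not the Nutch language identifier or Tika code: the class name NGramConfidence, the methods ngrams and confidence, and the use of a plain set of character n-grams as the language profile are all assumptions made for this sketch.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Nutch or Tika.
public class NGramConfidence {

    // Collect the distinct character n-grams of length n from the text.
    static Set<String> ngrams(String text, int n) {
        Set<String> grams = new HashSet<>();
        String s = text.toLowerCase(Locale.ROOT);
        for (int i = 0; i + n <= s.length(); i++) {
            grams.add(s.substring(i, i + n));
        }
        return grams;
    }

    // Percentage of the text's distinct n-grams that also occur in the
    // language profile: matching n-grams / total n-grams * 100.
    static double confidence(String text, Set<String> profile, int n) {
        Set<String> grams = ngrams(text, n);
        if (grams.isEmpty()) {
            return 0.0; // text shorter than n: nothing to compare
        }
        int matched = 0;
        for (String g : grams) {
            if (profile.contains(g)) {
                matched++;
            }
        }
        return 100.0 * matched / grams.size();
    }

    public static void main(String[] args) {
        // Toy "en" profile built from a single sentence (assumption:
        // a real profile would be trained on a large corpus).
        Set<String> en = ngrams("the quick brown fox jumps over the lazy dog", 3);
        System.out.printf("en: %.1f%% confidence%n",
                confidence("the lazy fox", en, 3));
    }
}
```

A score like this only measures overlap with a single profile. Comparing it against the scores for other candidate languages would probably be needed before reporting something like "en with 89% confidence".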