[ 
https://issues.apache.org/jira/browse/MAHOUT-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robin Anil updated MAHOUT-60:
-----------------------------

    Attachment: MAHOUT-60.patch

This is the latest diff against trunk.

Changes:
* Added a Result Analyzer class to generate classification statistics. It 
currently generates a confusion matrix and percentage accuracy. It will be 
extended to include precision, recall, RMSE, relative absolute error, and the 
kappa statistic.
* All such classes implement a Summarizable interface.
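
To illustrate the kind of statistics listed above, here is a minimal sketch of a confusion matrix with percentage accuracy. It is not the code in the patch: the class name ConfusionMatrixSketch, its methods, and the single summarize() method on Summarizable are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the Summarizable interface named above;
// the real signature in the patch may differ.
interface Summarizable {
    String summarize();
}

// Illustrative analyzer only; not the class from MAHOUT-60.patch.
class ConfusionMatrixSketch implements Summarizable {
    private final List<String> labels;
    private final int[][] counts; // counts[actual][predicted]

    ConfusionMatrixSketch(List<String> labels) {
        this.labels = new ArrayList<String>(labels);
        this.counts = new int[labels.size()][labels.size()];
    }

    // Maps a label to its row/column index, rejecting unknown labels.
    private int indexOf(String label) {
        int i = labels.indexOf(label);
        if (i < 0) {
            throw new IllegalArgumentException("Unknown label: " + label);
        }
        return i;
    }

    // Records one classified document.
    void addInstance(String actual, String predicted) {
        counts[indexOf(actual)][indexOf(predicted)]++;
    }

    int getCount(String actual, String predicted) {
        return counts[indexOf(actual)][indexOf(predicted)];
    }

    // Percentage of instances on the diagonal (actual == predicted).
    double accuracyPercent() {
        int correct = 0;
        int total = 0;
        for (int a = 0; a < counts.length; a++) {
            for (int p = 0; p < counts[a].length; p++) {
                total += counts[a][p];
                if (a == p) {
                    correct += counts[a][p];
                }
            }
        }
        return total == 0 ? 0.0 : 100.0 * correct / total;
    }

    // One accuracy line, then one row of counts per actual label.
    public String summarize() {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(Locale.ROOT, "Accuracy: %.2f%%%n", accuracyPercent()));
        for (int a = 0; a < counts.length; a++) {
            sb.append(labels.get(a)).append(" (actual):");
            for (int p = 0; p < counts[a].length; p++) {
                sb.append(' ').append(counts[a][p]);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Label names here are made up for the demonstration.
        ConfusionMatrixSketch m =
            new ConfusionMatrixSketch(Arrays.asList("sci.space", "rec.autos"));
        m.addInstance("sci.space", "sci.space");
        m.addInstance("sci.space", "rec.autos");
        m.addInstance("rec.autos", "rec.autos");
        System.out.print(m.summarize());
    }
}
```

Keeping the raw counts and deriving accuracy from them on demand means further statistics such as precision and recall could be computed from the same matrix without a second pass over the results.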

Before using this patch, please apply the MAHOUT-9 (Implement MapReduce 
BayesianClassifier) patch and follow the instructions given above.

The number of reducers is limited to 1 at the moment. I will need to figure 
out a way to read intermediate results.

You can run TestTwentyNewsgroups directly from the DFS as follows:

{noformat} 
$ bin/hadoop jar <MAHOUT_HOME>/build/apache-mahout-0.1-dev-ex.jar \
    org.apache.mahout.examples.classifiers.cbayes.TestTwentyNewsgroups \
    -p 20newsoutput/model -t work/20news-18828
{noformat} 





> Complementary Naive Bayes
> -------------------------
>
>                 Key: MAHOUT-60
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-60
>             Project: Mahout
>          Issue Type: Sub-task
>          Components: Classification
>            Reporter: Robin Anil
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 0.1
>
>         Attachments: MAHOUT-60.patch, MAHOUT-60.patch
>
>
> The focus is to implement an improved text classifier based on this paper 
> http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
