[ 
https://issues.apache.org/jira/browse/MAHOUT-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289439#comment-13289439
 ] 

Robin Anil commented on MAHOUT-941:
-----------------------------------

 Complementary Results: 
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances          :      68210       97.9058%
Incorrectly Classified Instances        :       1459        2.0942%
Total Classified Instances              :      69669

=======================================================
Confusion Matrix
-------------------------------------------------------
a       b       <--Classified as
27615   756      |  28371       a     = commons.apache.org
703     40595    |  41298       b     = cocoon.apache.org

=======================================================
Statistics
-------------------------------------------------------
Kappa                                   :    -1.1483
Accuracy                                :    0.6522
Consistency (stdev of accuracy)         :    0.5052


I am seeing this. Why is the accuracy reported as 0.6522 when it is actually 
0.979 (68210 / 69669, per the Summary)? Can you fix this issue?
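
For reference, here is a minimal sketch that recomputes both figures directly from the confusion matrix printed above. It is plain Python, not Mahout code: the function names `accuracy` and `cohen_kappa` are hypothetical, and it assumes rows are true labels and columns are predicted labels.

```python
# Confusion matrix copied from the report above.
# Assumed layout: rows = true label, columns = predicted label
# (a = commons.apache.org, b = cocoon.apache.org).
matrix = [
    [27615, 756],
    [703, 40595],
]


def accuracy(m):
    """Fraction of instances on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in m)
    correct = sum(m[i][i] for i in range(len(m)))
    return correct / total


def cohen_kappa(m):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (the accuracy) and p_e is the agreement
    expected by chance, computed from the row and column totals.
    """
    n = len(m)
    total = sum(sum(row) for row in m)
    p_o = accuracy(m)
    row_totals = [sum(row) for row in m]
    col_totals = [sum(m[i][j] for i in range(n)) for j in range(n)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (p_o - p_e) / (1 - p_e)


print(f"accuracy = {accuracy(matrix):.4f}")
print(f"kappa    = {cohen_kappa(matrix):.4f}")
```

On this matrix the sketch gives an accuracy of about 0.979 and a kappa of about 0.957. Kappa cannot fall below -1 by definition, so the reported -1.1483 looks wrong as well as the accuracy.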
                
> Improve ConfusionMatrix statistics
> ----------------------------------
>
>                 Key: MAHOUT-941
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-941
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification
>            Reporter: Lance Norskog
>            Assignee: Robin Anil
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: Bayes.zip, MAHOUT-941.patch, MAHOUT-941.patch, SGD.zip
>
>
> This patch adds more statistics to the ConfusionMatrix and RequestAnalyzer.
> # Add the Kappa measure, a standard measure comparing a sample vs. random 
> assignment.
> # Add the mean & standard deviation of individual labels, to assist in 
> identifying consistent misassignment vs. high- and low-quality labels.
> Also, the SGD solver saves its model periodically to /tmp/news-groups-number. 
> This patch moves those captures to the model/ output directory. (These 
> intermediate models are interesting for tracking SGD incremental development.)
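
As an illustration of the second statistic in the description, the sketch below computes a per-label accuracy and then its mean and standard deviation. The patch's exact definition is not shown in this message, so this is only one plausible reading: per-label accuracy is taken as the diagonal count divided by the row total, and the population standard deviation is used. The name `per_label_accuracy` is hypothetical and the matrix is the two-label one from the comment above.

```python
import statistics

# Assumed layout: rows = true label, columns = predicted label.
matrix = [
    [27615, 756],
    [703, 40595],
]


def per_label_accuracy(m):
    """For each label, the share of its instances that were classified correctly."""
    return [m[i][i] / sum(m[i]) for i in range(len(m))]


rates = per_label_accuracy(matrix)
mean_rate = statistics.mean(rates)
# Population standard deviation; whether the patch uses the population or
# the sample form is an assumption here.
stdev_rate = statistics.pstdev(rates)

print(f"per-label accuracy = {[round(r, 4) for r in rates]}")
print(f"mean = {mean_rate:.4f}, stdev = {stdev_rate:.4f}")
```

Under this reading, a small standard deviation means the labels are classified about equally well, while a large one points at individual labels that are consistently misassigned.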


        
