[ https://issues.apache.org/jira/browse/MAHOUT-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian Schelter updated MAHOUT-1391:
---------------------------------------
Resolution: Not a Problem
Status: Resolved (was: Patch Available)
If your test set contains labels that do not occur in your training set, then
your setup is flawed and you should not run that test.
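
One way to catch this before running the test is to compare the two label sets up front. The following is only a minimal sketch in plain Java and does not use any Mahout API; the class name `LabelCoverageCheck`, the method `unseenLabels`, and the sample labels are hypothetical and chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Mahout: reports test labels the model never saw.
public class LabelCoverageCheck {

    /** Returns the labels that occur in the test set but never in the training set. */
    public static Set<String> unseenLabels(List<String> trainingLabels, List<String> testLabels) {
        // Start from the distinct test labels (sorted for stable output) ...
        Set<String> unseen = new TreeSet<>(testLabels);
        // ... and drop every label that the training set also contains.
        unseen.removeAll(trainingLabels);
        return unseen;
    }

    public static void main(String[] args) {
        // Illustrative labels only; in practice these would be read from the two datasets.
        List<String> training = Arrays.asList("sports", "politics", "sports", "tech");
        List<String> test = Arrays.asList("tech", "finance", "sports", "health");

        Set<String> unseen = unseenLabels(training, test);
        if (!unseen.isEmpty()) {
            System.out.println("Test labels missing from training set: " + unseen);
        }
    }
}
```

If the returned set is not empty, the split should be redone (for example per label rather than over the whole dataset) before the model is tested.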
> Possibility to disable confusion matrix in naive bayes
> ------------------------------------------------------
>
> Key: MAHOUT-1391
> URL: https://issues.apache.org/jira/browse/MAHOUT-1391
> Project: Mahout
> Issue Type: New Feature
> Components: Classification
> Affects Versions: 0.8
> Reporter: Mansur Iqbal
> Fix For: 1.0
>
> Attachments: MAHOUT-1391.patch
>
>
> Sometimes the confusion matrix is too big and not really necessary.
> There is also a second case where disabling it would help:
> If you split a dataset with many labels into a test dataset and a training
> dataset by random percentage selection, some classes/labels in the test data
> may not appear in the training data at all. The label index created when
> building a model from the training data then lacks those labels. When you
> test that model with the test data, Mahout tries to create a confusion matrix
> with labels that are not in the label index and throws an exception.