Hello there,
I am using Mahout 0.6 for a classification task. I did the following steps:
1. Prepared the training file in the following format:
Label "Tab space" Content of the whole document1
Label "Tab space" Content of the whole document2
.
.
.
Label
2. Ran the training command:
bin/mahout trainclassifier -i /train -o /model -type cbayes -ng 1 -source hdfs
3. Prepared the test documents (a subset of the training documents) in the
following format:
Label "Tab space" Content of the whole document
4. Ran the testing command:
bin/mahout testclassifier -d /test-data -m /model -type cbayes -ng 1 -source hdfs -method sequential
output:
13/01/02 13:55:37 INFO bayes.TestClassifier: Testing Complementary Bayes Classifier
13/01/02 13:55:38 INFO bayes.SequenceFileModelReader: 1986.5261271629715
13/01/02 13:55:39 INFO bayes.InMemoryBayesDatastore: Label NaN NaN NaN
13/01/02 13:55:39 INFO bayes.TestClassifier:
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
13/01/02 13:55:39 INFO bayes.TestClassifier: ConfusionMatrix:
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
13/01/02 13:55:39 INFO bayes.TestClassifier: Classified instances from doc1
13/01/02 13:55:39 INFO bayes.TestClassifier:
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
13/01/02 13:55:39 INFO bayes.TestClassifier: ConfusionMatrix:
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
13/01/02 13:55:39 INFO bayes.TestClassifier: Classified instances from doc2
13/01/02 13:55:39 INFO bayes.TestClassifier:
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
13/01/02 13:55:39 INFO bayes.TestClassifier: ConfusionMatrix:
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
13/01/02 13:55:39 INFO bayes.TestClassifier: Classified instances from doc3
13/01/02 13:55:39 INFO bayes.TestClassifier:
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
13/01/02 13:55:39 INFO bayes.TestClassifier: ConfusionMatrix:
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
13/01/02 13:55:39 INFO bayes.TestClassifier: Classified instances from doc4
13/01/02 13:55:39 INFO bayes.TestClassifier:
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
13/01/02 13:55:39 INFO bayes.TestClassifier: ConfusionMatrix:
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
13/01/02 13:55:39 INFO bayes.TestClassifier: Classified instances from doc5
13/01/02 13:55:39 INFO bayes.TestClassifier:
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances : 0 0%
Incorrectly Classified Instances : 5 100%
Total Classified Instances : 5
=======================================================
Confusion Matrix
-------------------------------------------------------
a <--Classified as
0 | 0 a = Label
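For reference, the input format from steps 1 and 3 (label, one tab, document text) can be sanity-checked before training. The following is only a minimal Python sketch and not part of Mahout: the function name check_labeled_lines and the sample labels are made up for illustration. It counts how many documents each label has and reports lines that lack a label, a tab separator, or content. The log above lists only one class (a = Label), so the label counts are worth comparing against what was intended.

```python
from collections import Counter


def check_labeled_lines(lines):
    """Count labels in 'label<TAB>content' lines and collect malformed line numbers.

    Hypothetical helper for illustration; it only inspects the text format
    described in steps 1 and 3 and does not call Mahout.
    """
    label_counts = Counter()
    malformed = []
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only: everything after it is document text.
        label, separator, content = line.partition("\t")
        if not separator or not label.strip() or not content.strip():
            malformed.append(line_number)
            continue
        label_counts[label.strip()] += 1
    return label_counts, malformed


# Made-up sample data: two well-formed lines, one line without a tab, one blank line.
sample = [
    "sports\tThe match ended in a draw\n",
    "politics\tParliament passed the bill\n",
    "sports no tab here\n",
    "\n",
]
counts, bad_lines = check_labeled_lines(sample)
print("documents per label:", dict(counts))
print("malformed line numbers:", bad_lines)
```

To check a real file, the same function can be given an open file handle instead of the sample list, since it accepts any iterable of lines.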
Why can't Mahout classify the test documents correctly?
Thanks in advance