Hi Stuti, 

Yes, they are in HDFS. 

I think I have almost nailed down the problem. I checked my dataset, and only 
one record has the "EMI" label. When I split the dataset into test and training 
sets, I think that record ends up in the test set but not in the training set. 
I suspect this is the cause. To confirm, I am removing this record from my 
dataset and running the mahout commands again. 
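
One way to check this suspicion before rerunning the pipeline is to compare the 
labels present in the two splits. The following is a minimal Java sketch, not 
Mahout code: SplitLabelCheck and labelsMissingFromTraining are hypothetical 
names, and the label lists are made up for illustration. In practice the labels 
would have to be read from the train-vectors and test-vectors sequence files.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; it is not part of Mahout.
class SplitLabelCheck {

    // Returns the labels that occur in the test split but never in the training split.
    static Set<String> labelsMissingFromTraining(List<String> trainLabels,
                                                 List<String> testLabels) {
        Set<String> missing = new TreeSet<>(testLabels);
        missing.removeAll(new TreeSet<>(trainLabels));
        return missing;
    }

    public static void main(String[] args) {
        // Made-up labels: "EMI" has a single example, and it landed in the test split.
        List<String> train = Arrays.asList("LOAN", "CARD", "LOAN", "CARD");
        List<String> test = Arrays.asList("CARD", "EMI", "LOAN");

        // Any label printed here was never seen during training.
        System.out.println(labelsMissingFromTraining(train, test));
    }
}
```

With the sample lists above the program prints [EMI]. A non-empty result on the 
real splits would be consistent with the "Label not found" error, since the 
label index built during training would then have no entry for that label.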

Regards,
Anand.C

-----Original Message-----
From: Stuti Awasthi [mailto:[email protected]] 
Sent: Wednesday, May 15, 2013 10:56 AM
To: [email protected]
Subject: RE: java.lang.IllegalArgumentException: Label not found: EMI whi 
running mahout testnb

Hi ChandraMohan,

Yes, I am also looking at text classification using Mahout. I have also tried 
this link and it worked for me. Just a basic question: I hope you have your 
files in HDFS and not on the local file system.
That was the first mistake I made when running the 20 Newsgroups example.

Thanks
Stuti Awasthi

-----Original Message-----
From: Chandra Mohan, Ananda Vel Murugan [mailto:[email protected]] 
Sent: Wednesday, May 15, 2013 9:37 AM
To: [email protected]
Subject: RE: java.lang.IllegalArgumentException: Label not found: EMI whi 
running mahout testnb

Hi Stuti, 

Thanks for your response. The labels are present. I ran seqdumper as you 
suggested and I could see the labels. 

I see that you are also working on a similar text classification effort. I am 
following this link:

http://chimpler.wordpress.com/2013/03/13/using-the-mahout-naive-bayes-classifier-to-automatically-classify-twitter-messages/

Do you have any other links or references?

Regards,
Anand.C

-----Original Message-----
From: Stuti Awasthi [mailto:[email protected]]
Sent: Tuesday, May 14, 2013 12:23 PM
To: [email protected]
Subject: RE: java.lang.IllegalArgumentException: Label not found: EMI whi 
running mahout testnb

Hi Chandra,

I think that your labels are not created correctly. Check the file using 
seqdumper to see whether the labels are present in it.
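
Once the keys are visible in the seqdumper output, a per-label count makes rare 
labels easy to spot. The sketch below is only an illustration under one 
assumption: each sequence file key has the form /label/docId. LabelCounter, 
labelOf and countLabels are hypothetical names, the sample keys are made up, 
and labelOf would need adjusting if the keys were written differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; it is not part of Mahout.
class LabelCounter {

    // Extracts "label" from a key of the ASSUMED form "/label/docId".
    static String labelOf(String key) {
        String[] parts = key.split("/");
        if (parts.length < 3 || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Unexpected key format: " + key);
        }
        return parts[1];
    }

    // Counts how many keys carry each label, sorted by label name.
    static Map<String, Integer> countLabels(List<String> keys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(labelOf(key), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up keys standing in for those listed by seqdumper.
        List<String> keys = Arrays.asList("/EMI/1", "/LOAN/2", "/LOAN/3");
        System.out.println(countLabels(keys)); // prints {EMI=1, LOAN=2}
    }
}
```

A label with a count of 1 can land in only one of the training and test splits, 
so it is worth checking before splitting the data.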

Thanks
Stuti Awasthi

-----Original Message-----
From: Chandra Mohan, Ananda Vel Murugan [mailto:[email protected]]
Sent: Tuesday, May 14, 2013 10:10 AM
To: [email protected]
Subject: java.lang.IllegalArgumentException: Label not found: EMI whi running 
mahout testnb

Hi,

I am running a Hadoop 1.0.2 cluster in pseudo-distributed mode and my Mahout 
version is 0.7. I am trying to do text classification using the Mahout naïve 
Bayes commands.

I created sequence files using my custom Java program and uploaded the seq file 
to HDFS.

I am running the following mahout commands:

mahout seq2sparse -i my-seq -o my-vectors


mahout split -i my-vectors/tfidf-vectors --trainingOutput train-vectors \
  --testOutput test-vectors --randomSelectionPct 40 --overwrite --sequenceFiles \
  -xm sequential



mahout trainnb -i train-vectors -el -li labelindex -o model -ow -c



mahout testnb -i train-vectors -m model -l labelindex -ow -o my-testing -c



mahout testnb -i test-vectors -m model -l labelindex -ow -o tweets-testing -c

When I run this final command, I am getting the following exception.

Exception in thread "main" java.lang.IllegalArgumentException: Label not found: EMI
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
        at org.apache.mahout.classifier.ConfusionMatrix.getCount(ConfusionMatrix.java:102)
        at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:122)
        at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:126)
        at org.apache.mahout.classifier.ConfusionMatrix.addInstance(ConfusionMatrix.java:94)
        at org.apache.mahout.classifier.ResultAnalyzer.addInstance(ResultAnalyzer.java:71)
        at org.apache.mahout.classifier.naivebayes.test.TestNaiveBayesDriver.analyzeResults(TestNaiveBayesDriver.java:158)
        at org.apache.mahout.classifier.naivebayes.test.TestNaiveBayesDriver.run(TestNaiveBayesDriver.java:124)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.mahout.classifier.naivebayes.test.TestNaiveBayesDriver.main(TestNaiveBayesDriver.java:65)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)

Can someone help me fix this error?

Regards,
Anand.C

