I tried deleting every folder except alt.atheism from both the test and training data, but I get the identical error.
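One possible reading of the exception quoted further down in this thread, sketched under assumptions: the message ends in "alt.atheism from", which looks as if the label being looked up carried an extra token. The class below is a hypothetical stand-in, not Mahout's ConfusionMatrix, and the "label, tab, text" line layout is an assumption about the prepared input rather than something confirmed here. It only illustrates how a table keyed by known labels rejects a label that was not split cleanly from the document text.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in, not Mahout code: a count table that only accepts known labels. */
public class LabelLookupSketch {

    /** Counts per known label; unknown labels are rejected with an IllegalArgumentException. */
    static final class LabelCounts {
        private final Map<String, Integer> counts = new LinkedHashMap<>();

        LabelCounts(List<String> knownLabels) {
            for (String label : knownLabels) {
                counts.put(label, 0);
            }
        }

        void increment(String label) {
            if (!counts.containsKey(label)) {
                throw new IllegalArgumentException("Label not found: " + label);
            }
            counts.merge(label, 1, Integer::sum);
        }

        int get(String label) {
            return counts.getOrDefault(label, 0);
        }
    }

    /** Assumed input layout (not confirmed by the thread): label, one tab, then the text. */
    static String labelOf(String line) {
        int tab = line.indexOf('\t');
        return tab < 0 ? line : line.substring(0, tab);
    }

    public static void main(String[] args) {
        LabelCounts counts = new LabelCounts(Arrays.asList("alt.atheism", "sci.space"));

        // Well-formed line: the label is cut off at the tab and is found.
        counts.increment(labelOf("alt.atheism\tfrom someone at example dot org"));

        // No tab in the line: the whole string is treated as the label and is rejected.
        try {
            counts.increment(labelOf("alt.atheism from"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(counts.get("alt.atheism"));
    }
}
```

If this reading applies, the thing to inspect would be the first field of each line in bayes-test-input rather than the model folder; that is a guess to verify, not a diagnosis.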
I might try debugging the problem in Eclipse rather than from the command line, but Eclipse doesn't quite want to work either.

On Mon, Jul 4, 2011 at 8:02 PM, Vijay Santhanam <vijay.santha...@gmail.com> wrote:

> Thanks anyway Sergey. Could you perhaps upload your bayes-model folder so I
> could try that out?
>
> On Mon, Jul 4, 2011 at 7:57 PM, Sergey Bartunov <sbos....@gmail.com> wrote:
>
>> Well, that's strange. Sorry, I can't help you at the moment, maybe
>> someone else in the mailing list could.
>>
>> On 4 July 2011 13:49, Vijay Santhanam <vijay.santha...@gmail.com> wrote:
>> > Hi Sergey,
>> >
>> > Yes, there were no errors.
>> >
>> > And all the model data seems to have been populated into the bayes-model folder.
>> > Also, each main folder in bayes-model has a _SUCCESS file.
>> >
>> > See the tarball of my trained model here,
>> > http://dl.dropbox.com/u/7881451/bayes-model.tar.gz
>> > Please compare it to your trained model if possible, I would like to know if
>> > it's different in any way.
>> >
>> > Perhaps it's corrupted in some way.
>> >
>> > Thanks,
>> > Vijay
>> >
>> > On Mon, Jul 4, 2011 at 7:39 PM, Sergey Bartunov <sbos....@gmail.com> wrote:
>> >
>> >> Stop, did you _train_ the classifier successfully before running the
>> >> _test_?
>> >>
>> >> On 4 July 2011 13:30, Vijay Santhanam <vijay.santha...@gmail.com> wrote:
>> >> > Hi Sergey,
>> >> >
>> >> > I've tried using both the sh script file and following the instructions at
>> >> > https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html - like you suggested.
>> >> > Both return the same results.
>> >> >
>> >> > I've uploaded my bayes-test-input folder to dropbox, the first file is
>> >> > here...
>> >> > http://dl.dropbox.com/u/7881451/bayes-test-input/alt.atheism.txt
>> >> >
>> >> > Thanks,
>> >> > Vijay
>> >> >
>> >> > On Mon, Jul 4, 2011 at 7:23 PM, Sergey Bartunov <sbos....@gmail.com> wrote:
>> >> >
>> >> >> Paste somewhere your bayes-test-input file.
>> >> >>
>> >> >> On 4 July 2011 13:20, Sergey Bartunov <sbos....@gmail.com> wrote:
>> >> >> > Yes, I worked WITH hadoop, but there should be no difference.
>> >> >> >
>> >> >> > Why do you use examples/bin/build/20news-bayes.sh instead of
>> >> >> > running bin/mahout directly? Is it the same?
>> >> >> >
>> >> >> > On 4 July 2011 13:12, Vijay Santhanam <vijay.santha...@gmail.com> wrote:
>> >> >> >> Thanks Sergey,
>> >> >> >>
>> >> >> >> I'm still receiving the same error after following those steps.
>> >> >> >> I've chosen not to use hadoop - does yours work WITH hadoop?
>> >> >> >>
>> >> >> >> A few bits of info that might be relevant.
>> >> >> >>
>> >> >> >> My examples/bin/work folder contains the expected folders from test data
>> >> >> >> preparation and training...
>> >> >> >> drwxr-xr-x@ 22 Vijay staff 748 18 Mar  2003 20news-bydate-test
>> >> >> >> drwxr-xr-x@ 22 Vijay staff 748 18 Mar  2003 20news-bydate-train
>> >> >> >> drwxr-xr-x   3 Vijay staff 102  4 Jul 19:03 bayes-model
>> >> >> >> drwxr-xr-x  22 Vijay staff 748  4 Jul 18:20 bayes-test-input
>> >> >> >> drwxr-xr-x  22 Vijay staff 748  4 Jul 17:49 bayes-train-input
>> >> >> >>
>> >> >> >> I appreciate your help, do you have any other suggestions?
>> >> >> >>
>> >> >> >> Regards,
>> >> >> >> Vijay
>> >> >> >>
>> >> >> >> On Mon, Jul 4, 2011 at 6:58 PM, Sergey Bartunov <sbos.net@gmail.com> wrote:
>> >> >> >>
>> >> >> >>> When I started with Mahout I had the same errors. In my case, I just
>> >> >> >>> didn't run PrepareTwentyNewsgroups. You may try to carefully repeat
>> >> >> >>> all steps from https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html
>> >> >> >>>
>> >> >> >>> On 4 July 2011 12:52, Vijay Santhanam <vijay.santha...@gmail.com> wrote:
>> >> >> >>> > Hi All,
>> >> >> >>> >
>> >> >> >>> > I'm new to Mahout and I'm interested in experimenting with its classifiers.
>> >> >> >>> >
>> >> >> >>> > Right now, I'm just trying to get up and running with the demos and
>> >> >> >>> > examples.
>> >> >> >>> >
>> >> >> >>> > After checking out the mahout trunk, I've tried running the classification
>> >> >> >>> > example 20news, but after running the ./examples/bin/build/20news-bayes.sh
>> >> >> >>> > script I get the following error during the classification phase.
>> >> >> >>> >
>> >> >> >>> > Does anyone else get the same thing? Or have any recommendations about how
>> >> >> >>> > to fix it?
>> >> >> >>> > I'd just like to get a sample classifier working before I embark on my own
>> >> >> >>> > classification journey.
>> >> >> >>> >
>> >> >> >>> >
>> >> >> >>> > INFO: Loading model from: {basePath=examples/bin/work/20news-bydate/bayes-model, classifierType=bayes, alpha_i=1.0, dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8, defaultCat=unknown, testDirPath=examples/bin/work/20news-bydate/bayes-test-input}
>> >> >> >>> > Jul 4, 2011 6:28:25 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: Testing Bayes Classifier
>> >> >> >>> > Jul 4, 2011 6:28:27 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: Read 50000 feature weights
>> >> >> >>> > Jul 4, 2011 6:28:27 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: Read 100000 feature weights
>> >> >> >>> > Jul 4, 2011 6:28:28 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: 193370.88331085522
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: rec.sport.baseball -129829.34738930278 531784.7805631821 -0.2441388925268003
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: sci.crypt -193023.42370049533 531784.7805631821 -0.3629728242618669
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: rec.sport.hockey -167853.6159738822 531784.7805631821 -0.31564200802459647
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: talk.politics.guns -203524.0148974065 531784.7805631821 -0.3827187658170024
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: soc.religion.christian -163900.9258713857 531784.7805631821 -0.308209132457322
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: sci.electronics -142854.1677345925 531784.7805631821 -0.26863154598614886
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: comp.os.ms-windows.misc -531784.7805631821 531784.7805631821 -1.0
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: misc.forsale -143454.70176448982 531784.7805631821 -0.26976082619845826
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: talk.religion.misc -139428.73484148504 531784.7805631821 -0.2621901565024562
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: alt.atheism -139569.06867597546 531784.7805631821 -0.2624540486626301
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: comp.windows.x -178029.10523376046 531784.7805631821 -0.33477660839638973
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: talk.politics.mideast -193075.00789450994 531784.7805631821 -0.36306982627452317
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: comp.sys.ibm.pc.hardware -138410.02049984262 531784.7805631821 -0.2602745049477736
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: comp.sys.mac.hardware -125200.9927438868 531784.7805631821 -0.23543545682389364
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: sci.space -192437.0009266271 531784.7805631821 -0.3618700797018455
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: rec.motorcycles -143142.20855440624 531784.7805631821 -0.26917319522159455
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: rec.autos -141800.97549909537 531784.7805631821 -0.2666510601317365
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: comp.graphics -166882.18654471825 531784.7805631821 -0.3138152738556811
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: talk.politics.misc -165196.84193278523 531784.7805631821 -0.3106460507535303
>> >> >> >>> > Jul 4, 2011 6:28:30 PM org.slf4j.impl.JCLLoggerAdapter info
>> >> >> >>> > INFO: sci.med -192698.5183245711 531784.7805631821 -0.36236185270382393
>> >> >> >>> > Exception in thread "main" java.lang.IllegalArgumentException: Label not found: alt.atheism from
>> >> >> >>> >     at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>> >> >> >>> >     at org.apache.mahout.classifier.ConfusionMatrix.getCount(ConfusionMatrix.java:93)
>> >> >> >>> >     at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:113)
>> >> >> >>> >     at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:117)
>> >> >> >>> >     at org.apache.mahout.classifier.ConfusionMatrix.addInstance(ConfusionMatrix.java:85)
>> >> >> >>> >     at org.apache.mahout.classifier.ResultAnalyzer.addInstance(ResultAnalyzer.java:67)
>> >> >> >>> >     at org.apache.mahout.classifier.bayes.TestClassifier.classifySequential(TestClassifier.java:244)
>> >> >> >>> >     at org.apache.mahout.classifier.bayes.TestClassifier.main(TestClassifier.java:177)
>> >> >> >>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> >> >>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >> >> >>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >> >> >>> >     at java.lang.reflect.Method.invoke(Method.java:597)
>> >> >> >>> >     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>> >> >> >>> >     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>> >> >> >>> >     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
>> >> >> >>> >
>> >> >> >>> >
>> >> >> >>> > Any help is greatly appreciated.
>> >> >> >>> >
>> >> >> >>> > Regards,
>> >> >> >>> > --
>> >> >> >>> > Vijay Santhanam
>> >> >> >>> > Software Engineer

--
Vijay Santhanam
Software Engineer
http://au.linkedin.com/in/vijaysanthanam
0407525087