Hi,
I am trying to run the Bayes classifier by following the steps given at
https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html
When I run testclassifier I get the error below:
$ bin/mahout testclassifier -m examples/bin/work/20news-bydate/bayes-model -d examples/bin/work/20news-bydate/20news-bydate-test -type bayes -ng 1 -source hdfs -method sequential
Running on hadoop, using HADOOP_HOME=C:\cygwin\home\Divya\hadoop-0.20.2
HADOOP_CONF_DIR=C:\cygwin\home\Divya\hadoop-0.20.2\conf
10/11/18 11:59:06 INFO bayes.TestClassifier: Loading model from: {basePath=examples/bin/work/20news-bydate/bayes-model, classifierType=bayes, alpha_i=1.0, dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8, defaultCat=unknown, testDirPath=examples/bin/work/20news-bydate/20news-bydate-test}
10/11/18 11:59:06 INFO bayes.TestClassifier: Testing Bayes Classifier
10/11/18 11:59:07 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/20news-bydate/bayes-model/trainer-weights/Sigma_j/part-00000
10/11/18 11:59:07 INFO io.SequenceFileModelReader: Read 50000 feature weights
10/11/18 11:59:07 INFO io.SequenceFileModelReader: Read 100000 feature weights
10/11/18 11:59:08 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/20news-bydate/bayes-model/trainer-weights/Sigma_k/part-00000
10/11/18 11:59:08 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/20news-bydate/bayes-model/trainer-weights/Sigma_kSigma_j/part-00000
10/11/18 11:59:08 INFO io.SequenceFileModelReader: 190570.4012562479
10/11/18 11:59:08 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/20news-bydate/bayes-model/trainer-thetaNormalizer/part-00000
10/11/18 11:59:08 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/20news-bydate/bayes-model/trainer-tfIdf/trainer-tfIdf/part-00000
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: rec.sport.baseball -127395.14399316616 547567.2698760114 -0.232656608606305
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: sci.crypt -189010.62350617323 547567.2698760114 -0.3451824714595741
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: rec.sport.hockey -166203.2548335905 547567.2698760114 -0.3035302947731423
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: talk.politics.guns -198793.14260997134 547567.2698760114 -0.3630478911841921
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: soc.religion.christian -158106.48187003663 547567.2698760114 -0.2887434851718539
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: sci.electronics -138650.82033374818 547567.2698760114 -0.25321239592195427
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: comp.os.ms-windows.misc -547567.2698760114 547567.2698760114 -1.0
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: misc.forsale -141981.48005545404 547567.2698760114 -0.2592950453148956
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: talk.religion.misc -134885.60852883724 547567.2698760114 -0.2463361416020722
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: alt.atheism -134262.42728922528 547567.2698760114 -0.24519805086163576
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: comp.windows.x -172513.19965389522 547567.2698760114 -0.3150538922696353
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: talk.politics.mideast -189368.63272082788 547567.2698760114 -0.3458362892356726
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: comp.sys.ibm.pc.hardware -134535.56471897085 547567.2698760114 -0.24569687072317975
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: comp.sys.mac.hardware -121323.6282757108 547567.2698760114 -0.22156844455510052
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: sci.space -189203.04544769705 547567.2698760114 -0.3455338838834164
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: rec.motorcycles -138625.2628242977 547567.2698760114 -0.25316572127418674
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: rec.autos -136935.18434679657 547567.2698760114 -0.25007919917821886
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: comp.graphics -161979.38306986375 547567.2698760114 -0.29581640828631267
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: talk.politics.misc -159579.70032298338 547567.2698760114 -0.29143396455949216
10/11/18 11:59:10 INFO datastore.InMemoryBayesDatastore: sci.med -183835.5334355675 547567.2698760114 -0.3357314133790253
Exception in thread "main" java.io.FileNotFoundException: examples\bin\work\20news-bydate\20news-bydate-test\alt.atheism (Access is denied)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:106)
    at org.apache.mahout.common.FileLineIterator.getFileInputStream(FileLineIterator.java:100)
    at org.apache.mahout.common.FileLineIterable.<init>(FileLineIterable.java:53)
    at org.apache.mahout.classifier.bayes.TestClassifier.classifySequential(TestClassifier.java:252)
    at org.apache.mahout.classifier.bayes.TestClassifier.main(TestClassifier.java:186)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
I noticed that the testclassifier command line takes bayes-test-input as the
test directory, but I cannot find such a directory.
When I extract 20news-bydate.tar.gz I get only two directories,
20news-bydate-test and 20news-bydate-train.
Should I use 20news-bydate-test as the input test directory, or do I need to
create one of my own?
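In case it helps with diagnosis, here is a minimal Java sketch (my own, not part of Mahout) for checking what the entries under the test directory actually are. It rests on an unconfirmed assumption: that alt.atheism in the extracted archive is a directory, and that the exception comes from trying to open it as a plain file. The class name TestDirCheck and the method canOpenAsFile are hypothetical names used only for illustration; the default path is the one from my command above.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical helper, not part of Mahout: reports whether each entry in a
// test directory is a plain file or a directory, and whether it can be
// opened with FileInputStream (the call that fails in the stack trace).
class TestDirCheck {

    // Returns true if the path can be opened for reading as a regular file.
    // FileInputStream throws FileNotFoundException (an IOException) when the
    // path is missing, is a directory, or cannot be opened for reading.
    static boolean canOpenAsFile(File f) {
        try (FileInputStream in = new FileInputStream(f)) {
            return true;
        } catch (IOException e) {
            System.out.println(f + " -> " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: default to the test directory used in the post.
        File testDir = new File(args.length > 0
                ? args[0]
                : "examples/bin/work/20news-bydate/20news-bydate-test");

        File[] entries = testDir.listFiles();
        if (entries == null) {
            System.out.println("Not a readable directory: " + testDir);
            return;
        }
        for (File entry : entries) {
            String kind = entry.isDirectory() ? "directory" : "file";
            System.out.println(entry.getName() + ": " + kind
                    + ", opens as file: " + canOpenAsFile(entry));
        }
    }
}
```

If the entries are reported as directories that cannot be opened as files, that would be consistent with the FileNotFoundException above and would suggest that the sequential test expects one plain file per category rather than the raw extracted folders. I have not confirmed this against the Mahout source.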
Thanks
Regards,
Divya