Thanks deneche, I will test it soon.
On Sat, Apr 3, 2010 at 12:21 PM, deneche abdelhakim <a_dene...@yahoo.fr> wrote:

> Hi,
>
> I just committed a new version of TestForest. If you add "-mr" to the
> command line, it should launch a Hadoop job to classify the data. This is
> a basic implementation that cannot compute the confusion matrix, so using
> "-a" has no effect. This implementation is also not very well tested
> (being a work in progress), so if you want to test it, select a random
> subset of your test data and classify it using the sequential
> implementation (without -mr), then compare those predictions with the
> distributed implementation's. The results won't be exactly the same (due
> to the random behavior of the classifier when it encounters ties), but
> about 90% of the predictions should be the same.
>
> Let me know what you think of it. I'm working on the confusion matrix,
> but it will take some time to finish.
>
> --- On Fri, 26.3.10, Yang Sun <soushare....@gmail.com> wrote:
>
> > From: Yang Sun <soushare....@gmail.com>
> > Subject: Question about mahout Describe
> > To: mahout-user@lucene.apache.org
> > Date: Friday, March 26, 2010, 10:16 PM
> >
> > I was testing mahout recently. It runs great on small testing datasets.
> > However, when I try to expand the dataset to a big dataset directory,
> > I got the following error message:
> >
> > [localhost]$ hjar examples/target/mahout-examples-0.4-SNAPSHOT.job \
> >     org.apache.mahout.df.mapreduce.TestForest -i /user/fulltestdata/* \
> >     -ds rf/testdata.info -m rf-testmodel-5-100 -a -o rf/fulltestprediction
> >
> > Exception in thread "main" java.io.IOException: Cannot open filename /user/fulltestdata/*
> >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1474)
> >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1465)
> >     at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:372)
> >     at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
> >     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:351)
> >     at org.apache.mahout.df.mapreduce.TestForest.testForest(TestForest.java:190)
> >     at org.apache.mahout.df.mapreduce.TestForest.run(TestForest.java:137)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >     at org.apache.mahout.df.mapreduce.TestForest.main(TestForest.java:228)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >
> > My question is: can I use mahout on directories instead of single
> > files? And how?
> >
> > Thanks,
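The "compare the two runs" check deneche describes (sequential vs. -mr predictions agreeing on roughly 90% of rows, with ties broken randomly) can be scripted once both prediction files are loaded into arrays. A minimal sketch — `PredictionAgreement` and its toy label arrays are hypothetical, not part of Mahout:

```java
public class PredictionAgreement {
    // Fraction of positions where the two prediction arrays agree.
    static double agreement(int[] sequential, int[] distributed) {
        if (sequential.length != distributed.length) {
            throw new IllegalArgumentException("prediction counts differ");
        }
        int matches = 0;
        for (int i = 0; i < sequential.length; i++) {
            if (sequential[i] == distributed[i]) {
                matches++;
            }
        }
        return (double) matches / sequential.length;
    }

    public static void main(String[] args) {
        // Toy class labels standing in for the two runs' outputs.
        int[] seq = {0, 1, 1, 0, 1, 0, 0, 1, 1, 0};
        int[] mr  = {0, 1, 1, 0, 1, 0, 0, 1, 0, 0}; // one tie broken differently
        System.out.println(agreement(seq, mr)); // prints 0.9
    }
}
```

If the agreement stays near or above 0.9, the distributed job is consistent with the sequential one, per the explanation above; the remaining disagreements are expected from random tie-breaking.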
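The exception shows what went wrong: the shell did not expand `/user/fulltestdata/*` (the files are on HDFS, not the local filesystem), so TestForest passed the literal pattern to `DFSClient.open`, which fails because no file is named `*`. A tool that accepts a directory or pattern has to expand it into concrete files itself before opening each one; on HDFS that is what `FileSystem.globStatus(Path)` / `listStatus(Path)` provide. Below is a local-filesystem analogue of that expansion step using only `java.nio` — the `GlobExpand` class and `part-*` file names are illustrative, not Mahout code:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class GlobExpand {
    // Expand a glob such as "part-*" against a directory into concrete
    // file paths. This mirrors what a tool must do before opening each
    // input; opening the raw pattern string fails, as in the trace above.
    static List<Path> expand(Path dir, String pattern) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, pattern)) {
            for (Path p : stream) {
                files.add(p);
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory with two typical Hadoop part files.
        Path dir = Files.createTempDirectory("fulltestdata");
        Files.createFile(dir.resolve("part-00000"));
        Files.createFile(dir.resolve("part-00001"));
        System.out.println(expand(dir, "part-*").size()); // prints 2
    }
}
```

The same pattern applies on HDFS by substituting the Hadoop `FileSystem` calls for `Files.newDirectoryStream`; either way, the fix is to iterate over the expanded file list rather than open the pattern itself.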