It's working now, thanks for the fix. A download link for the data is available here:
https://cwiki.apache.org/MAHOUT/partial-implementation.html
with a description on how to edit the original data so it can work with DF. But because I am using the Random Forests builder right now, I am also testing it.

On Fri, Oct 15, 2010 at 10:45 PM, Drew Farris <[email protected]> wrote:
> There was indeed a problem with the mahout core job -- it did not
> include mahout-math in the lib directory. I've checked in a fix.
>
> Could you point me to some sample data in arff format I could use as
> input to test the Random Forests code? Are the KDDTrain+.arff and
> KDDTrain+.info something you can share publicly?
>
> Thx,
>
> Drew
>
> On Thu, Oct 14, 2010 at 11:59 PM, deneche abdelhakim <[email protected]>
> wrote:
>> I had issues with random forests. I was testing the following example:
>>
>> https://cwiki.apache.org/MAHOUT/partial-implementation.html
>>
>> When I run the following command:
>>
>> $HADOOP_HOME/bin/hadoop jar
>> $MAHOUT_HOME/core/target/mahout-core-<VERSION>-job.jar
>> org.apache.mahout.df.tools.Describe -p testdata/KDDTrain+.arff -f
>> testdata/KDDTrain+.info -d N 3 C 2 N C 4 N C 8 N 2 C 19 N L
>>
>> I get a java.lang.NoClassDefFoundError: org/apache/mahout/math/Vector
>>
>> The same command works fine with mahout-0.3.
>>
>> Inside mahout-core-0.3.job you can find the Vector class inside
>> org/apache/mahout/mahout/,
>> but I can't seem to find it inside mahout-core-0.4-SNAPSHOT-job.jar
>>
>> am I missing something ?
>>
>> On Fri, Oct 15, 2010 at 3:38 AM, Drew Farris <[email protected]> wrote:
>>> Hi Deneche, Grant,
>>>
>>> There is an issue on jira related to this (
>>> https://issues.apache.org/jira/browse/MAHOUT-505)
>>>
>>> The long and short of it is that nexus has problems with the way we were
>>> deploying artifacts that would prevent the jars for projects that produced
>>> job jars being deployed correctly. The job jar would be available when
>>> searching, but not the regular jar file. One way to work around this is to
>>> move from *.job to -job.jar
>>>
>>> This also allows us to use the maven assembly mechanism to build the job
>>> jars instead of using the ant build and maven build helper plug-in. There's
>>> nothing wrong with the approach pre-505, the post-505 approach just achieves
>>> the same goal with less configuration.
>>>
>>> As far as the Vector classes, the mahout-math jar is in the lib directory of
>>> the new job jars and thus available on the classpath when jobs are run using
>>> hadoop.
>>>
>>> Have you run into any issues using these new job jars? I've tested with the
>>> build-reuters.sh script and run bayes training haven't experienced any
>>> problems.
>>>
>>> Drew
>>>
>>>
>>> On Oct 14, 2010 12:57 PM, "Grant Ingersoll" <[email protected]> wrote:
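
For anyone who hits the same NoClassDefFoundError, here is a rough way to check whether a job jar really bundles the Vector class, either at the top level or inside a jar under lib/ as described in the thread. This is only a sketch using Python's standard zipfile module, not part of Mahout; the function name find_class is made up for illustration, and the assumption that dependencies sit under lib/ comes from the messages above.

```python
import io
import zipfile


def find_class(jar, class_path, prefix=""):
    """Return every location of class_path inside jar.

    jar may be a file name or a file-like object. Jars stored under
    lib/ are opened in memory and searched as well (assumption: the
    job jar keeps its dependencies there, as the thread describes).
    """
    hits = []
    with zipfile.ZipFile(jar) as zf:
        for name in zf.namelist():
            if name == class_path:
                hits.append(prefix + name)
            elif name.startswith("lib/") and name.endswith(".jar"):
                nested = io.BytesIO(zf.read(name))
                hits.extend(find_class(nested, class_path, prefix + name + "!/"))
    return hits


# Example call (the jar location on your machine is an assumption):
# find_class("mahout-core-0.4-SNAPSHOT-job.jar",
#            "org/apache/mahout/math/Vector.class")
```

An empty list means the class is not bundled anywhere in the jar, which would match the symptom reported above before the fix was checked in.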

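
A side note on the -d argument in the Describe command quoted above: as I understand the descriptor syntax from the wiki page linked in this thread, N is a numerical attribute, C a categorical one, I an ignored one, L the label, and a number repeats the type that follows it. The sketch below only illustrates that reading; expand_descriptor is a hypothetical helper, not Mahout's own parser.

```python
def expand_descriptor(descriptor):
    """Expand a compact descriptor such as 'N 3 C L' into one type per attribute."""
    expanded = []
    repeat = 1
    for token in descriptor.split():
        if token.isdigit():
            repeat = int(token)  # multiplier applies to the next type token
        else:
            if token not in {"N", "C", "I", "L"}:
                raise ValueError(f"unknown attribute type: {token}")
            expanded.extend([token] * repeat)
            repeat = 1
    return expanded


attrs = expand_descriptor("N 3 C 2 N C 4 N C 8 N 2 C 19 N L")
# prints: 42 34 7 1
print(len(attrs), attrs.count("N"), attrs.count("C"), attrs.count("L"))
```

Under that reading the descriptor covers 42 columns: 34 numerical, 7 categorical, and the label.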