Hi,

If you are mainly running into ClassNotFoundException in the Hadoop environment, I would suggest putting all the required jars (including the Mahout ones) on HADOOP_CLASSPATH in '$HADOOP_HOME/conf/hadoop-env.sh'. Also, when running the MR job, make sure that $HADOOP_HOME/conf is on your classpath.
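For example, a minimal sketch of the hadoop-env.sh entry (the jar paths below are placeholders; substitute wherever your Mahout jars actually live):

    # $HADOOP_HOME/conf/hadoop-env.sh
    # Prepend the Mahout jars so the client and task JVMs can see them.
    export HADOOP_CLASSPATH=/path/to/mahout-core.jar:/path/to/mahout-math.jar:$HADOOP_CLASSPATH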
Regards,
Lokendra

On Mon, Feb 21, 2011 at 9:50 PM, Zhengguo 'Mike' SUN <[email protected]> wrote:

> Hi All,
>
> I was playing with the LanczosSolver class in Mahout. What I did was copy
> the code in TestDistributedLanczosSolver.java and try to run it on a
> shared cluster. I also packaged 5 jars (core, core-test, math, math-test,
> and mahout-collection) under the lib/ directory of my own jar. This new
> jar worked correctly on my local machine under Hadoop's local mode. When I
> submitted it to the cluster, I got a ClassNotFoundException when running
> the TimesSquaredJob. The stack trace is as follows:
>
> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
>     at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:247)
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:866)
>     at org.apache.hadoop.io.WritableName.getClass(WritableName.java:71)
>     at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1613)
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1555)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
>     at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
>     at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:63)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
> I also wrote a simple MapReduce job to test whether I could access the
> Vector class, with some naive code like the following:
>
>     Vector v = new DenseVector(100);
>     v.assign(3.14);
>
> This job worked fine on the cluster. So it seems that referencing the
> Vector class itself is not the problem. What could be wrong if it is not a
> dependency problem?
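One more note on the stack trace: the failure happens inside WritableName.getClass while the SequenceFile header is being read, i.e. the value class is resolved by name in the task JVM (via Configuration.getClassByName) rather than referenced directly from your mapper code. That is why your naive test job can pass while TimesSquaredJob fails. As a sketch only (the class name ClasspathProbe is made up for illustration), this mirrors the lookup Hadoop performs:

    // Sketch only: mirrors the Configuration.getClassByName call in the trace.
    // It throws ClassNotFoundException unless the jar containing
    // org.apache.mahout.math.Vector is on the JVM's classpath.
    import org.apache.hadoop.conf.Configuration;

    public class ClasspathProbe {
        public static void main(String[] args) throws ClassNotFoundException {
            Configuration conf = new Configuration();
            Class<?> valueClass = conf.getClassByName("org.apache.mahout.math.Vector");
            System.out.println("Resolved: " + valueClass.getName());
        }
    }

So putting the Mahout math jar on HADOOP_CLASSPATH on the task nodes, as suggested above, should make that by-name lookup succeed.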
