----- Forwarded Message -----
From: Zhengguo 'Mike' SUN <zhengguo...@yahoo.com>
To: mahout-user <u...@mahout.apache.org>
Cc: mahout-dev <mahout-...@lucene.apache.org>
Sent: Monday, February 21, 2011 11:20 AM
Subject: LanczosSolver and ClassNotFoundException
Hi All,

I was playing with the LanczosSolver class in Mahout. What I did was copy the code in TestDistributedLanczosSolver.java and try to run it on a shared cluster. I packaged five jars -- mahout-core, mahout-core-tests, mahout-math, mahout-math-tests, and mahout-collections -- under the lib/ directory of my own job jar (see the layout sketch at the end of this message). This jar worked correctly on my local machine in Hadoop's local mode, but when I submitted it to the cluster I got a ClassNotFoundException while TimesSquaredJob was running. The stack trace is as follows:

Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:866)
    at org.apache.hadoop.io.WritableName.getClass(WritableName.java:71)
    at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1613)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1555)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
    at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:63)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

I also wrote a simple MapReduce job to test whether I could access the Vector class on the cluster, with some naive code like the following:

    Vector v = new DenseVector(100);
    v.assign(3.14);

This job ran fine on the cluster (a fuller sketch of it is below), so merely referencing the Vector class does not seem to be the problem. What could be wrong, if it is not a dependency problem?
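For reference, the layout of my job jar looks roughly like this (jar file names and version suffixes are illustrative and may not match exactly; the driver class name is made up):

    $ jar tf myjob.jar
    META-INF/MANIFEST.MF
    com/example/MyLanczosDriver.class
    lib/mahout-core-0.4.jar
    lib/mahout-core-0.4-tests.jar
    lib/mahout-math-0.4.jar
    lib/mahout-math-0.4-tests.jar
    lib/mahout-collections-1.0.jar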
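And here is roughly what that Vector test job looks like (a reconstructed sketch, not the exact code I ran; class and path names are illustrative, but the two Vector lines are exactly the ones quoted above). It uses the old org.apache.hadoop.mapred API, matching the stack trace:

    import java.io.IOException;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.mahout.math.DenseVector;
    import org.apache.mahout.math.Vector;

    public class VectorClasspathCheck {

      public static class CheckMapper extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, NullWritable> out, Reporter reporter)
            throws IOException {
          // If mahout-math were missing from the task classpath, instantiating
          // DenseVector here would fail just as it does in TimesSquaredJob.
          Vector v = new DenseVector(100);
          v.assign(3.14);
          out.collect(new Text("norm=" + v.norm(2)), NullWritable.get());
        }
      }

      public static void main(String[] args) throws IOException {
        // JobConf(Class) locates the containing job jar and ships it to the cluster.
        JobConf conf = new JobConf(VectorClasspathCheck.class);
        conf.setJobName("vector-classpath-check");
        conf.setMapperClass(CheckMapper.class);
        conf.setNumReduceTasks(0);  // map-only job
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(NullWritable.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
      }
    }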