Hi,

Where can I find info on getting started with Mahout (remote installation) and Eclipse?
Regards,
Rajat

Mihai Josan <mihai.jo...@iquestgroup.com> wrote:

Hello,

After I made the changes, I still get the ClassNotFoundException. I created my project using Maven and Eclipse, and the jar was generated with Eclipse's "Export JAR". Do you have any other idea how to resolve this problem?

mahout kmeans -i /user/rhadoop/mahout/abac-out/sequence \
  -c /user/rhadoop/mahout/abac-out/canopy-centroids/clusters-0 \
  -o /user/rhadoop/mahout/abac-out/clusters-out/ \
  -x 10 \
  -dm clustering.AbacDistanceMeasure \
  -ow

** how the $CLASSPATH looks after line 120:

CLASSPATH /usr/lib/sqoop/postgresql-9.2-1002.jdbc4.jar:/etc/mahout/conf.dist:/usr/lib/mahout/lib/abacDistance.jar
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.1.2-job.jar
HADOOP_CLASSPATH /etc/mahout/conf.dist:/usr/lib/mahout/mahout-examples-*-job.jar:/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.1.2.jar

** the exec command:

CMD: /usr/lib/hadoop/bin/hadoop jar /usr/lib/mahout/mahout-examples-0.7-cdh4.1.2-job.jar org.apache.mahout.driver.MahoutDriver kmeans -i /user/rhadoop/mahout/abac-out/sequence -c /user/rhadoop/mahout/abac-out/canopy-centroids/clusters-0 -o /user/rhadoop/mahout/abac-out/clusters-out/ -x 10 -dm clustering.AbacDistanceMeasure -ow

13/02/13 17:59:08 INFO common.AbstractJob: Command line arguments: {--clusters=[/user/rhadoop/mahout/abac-out/canopy-centroids/clusters-0], --convergenceDelta=[0.5], --distanceMeasure=[clustering.AbacDistanceMeasure], --endPhase=[2147483647], --input=[/user/rhadoop/mahout/abac-out/sequence], --maxIter=[10], --method=[mapreduce], --output=[/user/rhadoop/mahout/abac-out/clusters-out/], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
Exception in thread "main" java.lang.IllegalStateException: java.lang.ClassNotFoundException: clustering.AbacDistanceMeasure
        at org.apache.mahout.common.ClassUtils.instantiateAs(ClassUtils.java:30)
        at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:92)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:49)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.ClassNotFoundException: clustering.AbacDistanceMeasure
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:169)
        at org.apache.mahout.common.ClassUtils.instantiateAs(ClassUtils.java:28)
        ... 15 more

Thank you,
Mihai Josan

-----Original Message-----
From: Dan Filimon [mailto:dangeorge.fili...@gmail.com]
Sent: Wednesday, February 13, 2013 1:25 PM
To: user@mahout.apache.org
Subject: Re: how to use a custom distance measure with kmeans?

Sure, that sounds like an even better solution!
I didn't read the entire script. :)

On Wed, Feb 13, 2013 at 6:40 AM, Mahesh Balija <balijamahesh....@gmail.com> wrote:
> Hi Dan,
>
> If we copy the jar containing the custom classes to the MAHOUT_HOME/lib
> folder, won't that work fine? Line 147 of the mahout script reads all
> jars under the lib folder and puts them on the classpath.
>
> If this doesn't work, there should probably be a better way to add
> custom classes to the classpath than having users modify the script file.
>
> Thanks,
> Mahesh Balija,
> Calsoft Labs.
>
> On Tue, Feb 12, 2013 at 10:18 PM, Dan Filimon <dangeorge.fili...@gmail.com> wrote:
>
>> You need to add the JAR containing the distance measure you want to
>> the classpath.
>> By default the CLASSPATH is set in line 120 of the mahout script
>> (the script itself is in the bin/ folder of your Mahout installation).
>>
>> Sadly, I don't think that script allows you to set the classpath by
>> default, but it should be a simple addition.
>> You can either:
>> a. add the path to your JAR/class folder manually at line 120, or
>> b. (the cleaner way) add a new variable called something like
>>    MAHOUT_EXTRA_CLASSPATH to line 120, which you can set to whatever
>>    you need.
>>
>> b. is a bit cleaner, but you need to modify the script anyway.
>>
>> Alternatively, if you dislike fudging with the script, you can have a
>> closer look at it and see that running 'mahout classpath' gives you
>> the classpath it builds. Then you can run the hadoop script directly,
>> as in line 252 of the script, and edit the HADOOP_CLASSPATH (see
>> http://stackoverflow.com/questions/3799679/how-to-run-a-hadoop-program).
>>
>> This should really be better documented. Sorry you're having trouble!
>>
>> Good luck!
>> :)
>>
>> On Tue, Feb 12, 2013 at 6:30 PM, Mihai Josan <mihai.jo...@iquestgroup.com> wrote:
>> > This is the error I receive:
>> >
>> > mahout kmeans -i /user/rhadoop/in/sequence/ \
>> >   -c /user/rhadoop/out/canopy-centroids/clusters-0 \
>> >   -o /user/rhadoop/out/clusters-out/ \
>> >   -x 10 \
>> >   -dm /home/rhadoop/projects/workspace/mahout_abac/target/classes/clustering/AbacDistanceMeasure.class
>> >
>> > MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
>> > Running on hadoop, using /usr/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
>> > MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.1.2-job.jar
>> > 13/02/12 17:05:57 INFO common.AbstractJob: Command line arguments: {--clusters=[/user/rhadoop/out/canopy-centroids/clusters-0], --convergenceDelta=[0.5], --distanceMeasure=[/home/rhadoop/projects/workspace/mahout_abac/target/classes/clustering/AbacDistanceMeasure.class], --endPhase=[2147483647], --input=[/user/rhadoop/in/sequence/], --maxIter=[10], --method=[mapreduce], --output=[/user/rhadoop/out/clusters-out2/], --startPhase=[0], --tempDir=[temp]}
>> > Exception in thread "main" java.lang.IllegalStateException: java.lang.ClassNotFoundException: /home/rhadoop/projects/workspace/mahout_abac/target/classes/clustering/AbacDistanceMeasure.class
>> >         at org.apache.mahout.common.ClassUtils.instantiateAs(ClassUtils.java:30)
>> >         at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:92)
>> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>> >         at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:49)
>> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >         at java.lang.reflect.Method.invoke(Method.java:597)
>> >         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
>> >         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
>> >         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >         at java.lang.reflect.Method.invoke(Method.java:597)
>> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
>> > Caused by: java.lang.ClassNotFoundException: /home/rhadoop/projects/besmart/workspace/mahout_abac/target/classes/clustering/AbacDistanceMeasure.class
>> >         at java.lang.Class.forName0(Native Method)
>> >         at java.lang.Class.forName(Class.java:169)
>> >         at org.apache.mahout.common.ClassUtils.instantiateAs(ClassUtils.java:28)
>> >         ... 15 more
>> >
>> > Is this the proper way to use the custom distance measure? Or should I
>> > package the class? And how?
>> >
>> > Thank you in advance,
>> > Mihai Josan
>> >
>> >> Are you getting any errors?
>> >> Can you specify the fully qualified class name of your distance
>> >> measure (like com.xxx.MyDistanceMeasure) and check?
>> >>
>> >> Best,
>> >> Mahesh Balija,
>> >> Calsoft Labs.
>> >>
>> >> On Tue, Feb 12, 2013 at 2:28 PM, Mihai Josan <mihai.jo...@iquestgroup.com> wrote:
>> >>
>> >> > Hello,
>> >> >
>> >> > Can you please tell me how I can use a custom-made distance
>> >> > measure with Mahout on the command line?
>> >> > I am trying to do a clustering using this distance, like:
>> >> >
>> >> > mahout kmeans -i in/sequence/ \
>> >> >   -c out/centroids/clusters-0 \
>> >> >   -o out/clusters-out/ \
>> >> >   -x 10 \
>> >> >   -dm MyDistanceMeasure \
>> >> >   -ow
>> >> >
>> >> > Thank you in advance,
>> >> > Mihai
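
A note on the two traces in this thread: both end in Class.forName, called from ClassUtils.instantiateAs. Class.forName expects a fully qualified class name, so a filesystem path to a .class file is rejected outright, and a correct name still fails when the class is not on the classpath of the JVM that does the lookup. The following is only a minimal sketch of that behaviour using plain JDK classes; ClassNameLookup and canLoad are hypothetical names made up for illustration, and java.util.ArrayList merely stands in for "a class that is on the classpath". It is not Mahout code.

```java
public class ClassNameLookup {

    // Hypothetical helper: returns true if the string resolves to a class
    // visible to the current class loader, false if the lookup fails.
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A fully qualified name of a class on the classpath resolves.
        System.out.println(canLoad("java.util.ArrayList"));

        // A path to a .class file is not a class name, so the lookup fails
        // (this mirrors the first trace in the thread).
        System.out.println(canLoad(
            "/home/rhadoop/projects/workspace/mahout_abac/target/classes/clustering/AbacDistanceMeasure.class"));

        // A correct name only resolves if the jar holding the class is on
        // the classpath of this JVM (this mirrors the second trace).
        System.out.println(canLoad("clustering.AbacDistanceMeasure"));
    }
}
```

Running the same check with the exact classpath that the hadoop command receives may help confirm whether the jar with the custom measure is actually visible to that JVM.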