Hi Dan,

If we copy the jar containing the custom classes to the MAHOUT_HOME/lib
folder, won't that work fine? At line 147, the mahout script reads all jars
under the lib folder and puts them on the classpath.
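
To illustrate the kind of loop I mean, here is a rough sketch. It is not
copied from the mahout script; the function name build_lib_classpath and its
directory argument are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the mahout script): build a
# colon-separated classpath from every *.jar directly under a directory.
build_lib_classpath() {
  lib_dir="$1"
  classpath=""
  for jar in "$lib_dir"/*.jar; do
    # Skip the unexpanded pattern when the folder holds no jars.
    [ -e "$jar" ] || continue
    if [ -z "$classpath" ]; then
      classpath="$jar"
    else
      classpath="$classpath:$jar"
    fi
  done
  printf '%s\n' "$classpath"
}

# Example call (assumes MAHOUT_HOME points at the installation):
#   build_lib_classpath "$MAHOUT_HOME/lib"
```

If the script does something equivalent, a custom jar dropped into lib should
be picked up without any further change.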

If this won't work, there should probably be a better way to add custom
classes to the classpath than having users modify the script file.
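
One possibility, along the lines of the MAHOUT_EXTRA_CLASSPATH idea quoted
below, is sketched here. The function name append_extra_classpath is
hypothetical, and MAHOUT_EXTRA_CLASSPATH is only the proposed name; as far as
this thread goes, the stock script does not read such a variable.

```shell
#!/bin/sh
# Hypothetical helper: append an optional, user-supplied classpath to the
# one the script has already built. Nothing is appended when the extra
# part is empty, so the default behaviour stays unchanged.
append_extra_classpath() {
  base="$1"
  extra="$2"
  if [ -z "$extra" ]; then
    printf '%s\n' "$base"
  elif [ -z "$base" ]; then
    printf '%s\n' "$extra"
  else
    printf '%s\n' "$base:$extra"
  fi
}

# Inside the script this could be used roughly as (assumed variable name):
#   CLASSPATH="$(append_extra_classpath "$CLASSPATH" "${MAHOUT_EXTRA_CLASSPATH:-}")"
```

With something like this in place, users would only export the variable
before calling mahout instead of editing the script each time.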

Thanks,
Mahesh Balija,
Calsoft Labs.

On Tue, Feb 12, 2013 at 10:18 PM, Dan Filimon
<[email protected]> wrote:

> You need to add the JAR containing the distance measure you want to
> the classpath.
> By default the CLASSPATH is set in line 120 of the mahout script. (the
> script itself is in the bin/ folder of your Mahout installation).
>
> Sadly, I don't think the script lets you set the classpath by
> default, but it should be a simple addition.
> You can either:
> a. add the path to your JAR/class folder manually at line 120
> b. (the cleaner way) add a new variable called something like
> MAHOUT_EXTRA_CLASSPATH to line 120 which you can set to whatever you
> need.
>
> b. is a bit cleaner, but you need to modify the script anyway.
>
> Alternatively, if you dislike fiddling with the script, you can have a
> closer look at it and see that running 'mahout classpath' prints
> the classpath it builds. Then you can run the hadoop script directly,
> as in line 252 of the script, and edit HADOOP_CLASSPATH (see
> http://stackoverflow.com/questions/3799679/how-to-run-a-hadoop-program).
>
> This should really be better documented. Sorry you're having trouble!
>
> Good luck! :)
>
> On Tue, Feb 12, 2013 at 6:30 PM, Mihai Josan
> <[email protected]> wrote:
> > This is the error I receive:
> >
> > mahout kmeans -i /user/rhadoop/in/sequence/ \
> >>        -c  /user/rhadoop/out/canopy-centroids/clusters-0 \
> >>        -o  /user/rhadoop/out/clusters-out/ \
> >>        -x 10 \
> >>        -dm
> /home/rhadoop/projects/workspace/mahout_abac/target/classes/clustering/AbacDistanceMeasure.class
> >
> > MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> > Running on hadoop, using /usr/lib/hadoop/bin/hadoop and
> HADOOP_CONF_DIR=/etc/hadoop/conf
> > MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.1.2-job.jar
> > 13/02/12 17:05:57 INFO common.AbstractJob: Command line arguments:
> {--clusters=[/user/rhadoop/out/canopy-centroids/clusters-0],
> --convergenceDelta=[0.5],
> --distanceMeasure=[/home/rhadoop/projects/workspace/mahout_abac/target/classes/clustering/AbacDistanceMeasure.class],
> --endPhase=[2147483647], --input=[/user/rhadoop/in/sequence/],
> --maxIter=[10], --method=[mapreduce],
> --output=[/user/rhadoop/out/clusters-out2/], --startPhase=[0],
> --tempDir=[temp]}
> > Exception in thread "main" java.lang.IllegalStateException:
> java.lang.ClassNotFoundException:
> /home/rhadoop/projects/workspace/mahout_abac/target/classes/clustering/AbacDistanceMeasure.class
> >         at
> org.apache.mahout.common.ClassUtils.instantiateAs(ClassUtils.java:30)
> >         at
> org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:92)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> >         at
> org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:49)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
> >         at
> org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
> >         at
> org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> > Caused by: java.lang.ClassNotFoundException:
> /home/rhadoop/projects/besmart/workspace/mahout_abac/target/classes/clustering/AbacDistanceMeasure.class
> >         at java.lang.Class.forName0(Native Method)
> >         at java.lang.Class.forName(Class.java:169)
> >         at
> org.apache.mahout.common.ClassUtils.instantiateAs(ClassUtils.java:28)
> >         ... 15 more
> >
> >
> > Is this the proper way to use the custom distance measure, or should I
> > package the class? If so, how?
> >
> > Thank you in advance,
> > Mihai Josan
> >
> >> Are you getting any errors?
> >> Can you specify the fully qualified class name of your distance measure
> >> (like com.xxx.MyDistanceMeasure) and check?
> >>
> >> Best,
> >> Mahesh Balija,
> >> Calsoft Labs.
> >>
> >>
> >> On Tue, Feb 12, 2013 at 2:28 PM, Mihai Josan
> >> <[email protected]> wrote:
> >>
> >> > Hello,
> >> >
> >> > Can you please tell me how I can use a custom-made distance measure
> >> > with Mahout on the command line?
> >> > I am trying to run a clustering using this distance measure, like:
> >> >
> >> > mahout kmeans -i in/sequence/ \
> >> >        -c  out/centroids/clusters-0 \
> >> >        -o  out/clusters-out/ \
> >> >        -x 10 \
> >> >        -dm MyDistanceMeasure \
> >> >        -ow
> >> >
> >> > Thank you in advance,
> >> > Mihai
> >> >
>
