Okay, so here is what I did:
1. Used Maven to create the jar (*mvn jar:jar*):

hadoop@ubuntu:/tmp/mahout-jar$ jar -tf mahout-fpgrowth-1.0-SNAPSHOT.jar
META-INF/
META-INF/MANIFEST.MF
com/
com/musigma/
com/musigma/hpc/
com/musigma/hpc/CallFPGrowth.class
META-INF/maven/
META-INF/maven/com.musigma.hpc/
META-INF/maven/com.musigma.hpc/mahout-fpgrowth/pom.xml
META-INF/maven/com.musigma.hpc/mahout-fpgrowth/pom.properties

2. Placed this jar in /tmp/mahout-jar and added that folder to HADOOP_CLASSPATH in hadoop-env.sh:

export HADOOP_CLASSPATH=/tmp/mahout-jar/*:$HADOOP_CLASSPATH

3. Copied drivers.classes.props into the /tmp/mahout-jar folder.

4. In that copy of drivers.classes.props, added an entry for my class at the end:

com.musigma.hpc.CallFPGrowth = callfpgrowth : calling fpgrowth

5. In the terminal, used export to point MAHOUT_CONF_DIR at /tmp/mahout-jar:

export MAHOUT_CONF_DIR=/tmp/mahout-jar

6. Now, from /tmp/mahout-jar, I call my Mahout code with

# mahout callfpgrowth

and it's RUNNING...!!! :-)

Alternatively, I was also able to run the code with the following command:

sudo java -classpath mahout-fpgrowth-1.0-SNAPSHOT.jar:/usr/local/hadoop/hadoop/hadoop-0.20.2-core.jar:/tmp/mahout-distribution-0.5/core/target/mahout-core-0.5-job.jar com.musigma.hpc.CallFPGrowth

Thanks everyone for your guidance,
Praveenesh

On Sun, Sep 25, 2011 at 3:24 PM, Ted Dunning <[email protected]> wrote:

> A better workflow, in my opinion, is to make a separate maven project for
> code that uses Mahout. See https://github.com/tdunning/Chapter-16 for an
> example.
>
> Then you can simply compile, test and run your code using Maven or Eclipse
> or IntelliJ. Moreover, mvn will handle jarring up your code and all the
> dependencies that you want to include.
>
> If you need changes to Mahout behavior, pop open the Mahout source and use
> maven again. Write tests to demonstrate the function you want and then use
> maven install to push the mahout jar into your local repo.
> If your code on the Mahout side is changing often, it probably ought to go
> into your work project instead of inside Mahout anyway.
>
> On Sun, Sep 25, 2011 at 2:12 PM, Lance Norskog <[email protected]> wrote:
>
> > For development, you can put the source in the Mahout tree and get it into
> > your job jars with 'mvn install'.
> > If you want your own independent source code, you can make a new Maven
> > project that creates your job.jar.
> > I do not do this until I am happy with how things work inside the Mahout
> > source tree.
> >
> > On Sun, Sep 25, 2011 at 12:52 AM, praveenesh kumar <[email protected]> wrote:
> >
> > > Okay.. Heres what I am trying to do.
> > >
> > > My code is this :
> > >
> > > import java.io.File;
> > > import java.io.IOException;
> > > import java.nio.charset.Charset;
> > > import java.util.ArrayList;
> > > import java.util.Arrays;
> > > import java.util.Collection;
> > > import java.util.HashSet;
> > > import java.util.Map;
> > > import java.util.Set;
> > > import java.util.List;
> > >
> > > import org.apache.hadoop.conf.Configuration;
> > > import org.apache.hadoop.fs.FileSystem;
> > > import org.apache.hadoop.fs.Path;
> > > import org.apache.hadoop.io.SequenceFile;
> > > import org.apache.hadoop.io.Text;
> > > //import org.apache.lucene.util.Attribute;
> > > import org.apache.mahout.common.FileLineIterable;
> > > import org.apache.mahout.common.StringRecordIterator;
> > >
> > > import org.apache.mahout.fpm.pfpgrowth.convertors.ContextStatusUpdater;
> > > import org.apache.mahout.fpm.pfpgrowth.convertors.SequenceFileOutputCollector;
> > > import org.apache.mahout.fpm.pfpgrowth.convertors.string.StringOutputConverter;
> > >
> > > import org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns;
> > > import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth;
> > > //import org.apache.mahout.math.map.OpenLongObjectHashMap;
> > >
> > > import org.apache.mahout.common.Pair;
> > >
> > > public class DellFPGrowth {
> > >
> > >     public static void main(String[] args) throws IOException {
> > >
> > >         Set<String> features = new HashSet<String>();
> > >         String input = "/mnt/hgfs/Hadoop-automation/new-delltransaction.txt";
> > >         int minSupport = 1;
> > >         int maxHeapSize = 50;//top-k
> > >         String pattern = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";
> > >         Charset encoding = Charset.forName("UTF-8");
> > >         FPGrowth<String> fp = new FPGrowth<String>();
> > >         String output = "/tmp/output.txt";
> > >         Path path = new Path(output);
> > >         Configuration conf = new Configuration();
> > >         FileSystem fs = FileSystem.get(conf);
> > >
> > >         SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
> > >                 Text.class, TopKStringPatterns.class);
> > >
> > >         fp.generateTopKFrequentPatterns(
> > >                 new StringRecordIterator(new FileLineIterable(new File(input), encoding, false), pattern),
> > >                 fp.generateFList(
> > >                         new StringRecordIterator(new FileLineIterable(new File(input), encoding, false), pattern),
> > >                         minSupport),
> > >                 minSupport,
> > >                 maxHeapSize,
> > >                 features,
> > >                 new StringOutputConverter(new SequenceFileOutputCollector<Text,TopKStringPatterns>(writer)),
> > >                 new ContextStatusUpdater(null));
> > >
> > >         writer.close();
> > >
> > >         List<Pair<String,TopKStringPatterns>> frequentPatterns =
> > >                 FPGrowth.readFrequentPattern(fs, conf, path);
> > >         for (Pair<String,TopKStringPatterns> entry : frequentPatterns) {
> > >             System.out.println(entry.getSecond());
> > >         }
> > >         System.out.print("\nthe end! ");
> > >     }
> > >
> > > }
> > >
> > > 1. I am able to compile and run this code from eclipse, so I took the
> > > .class file from eclipse target folder. Put it in some other directory
> > > and make a simple jar file using jar -cvf command.
> > >
> > > 2. Since I am using mahout 0.4 and MAHOUT_CONF_DIR is default pointed to
> > > $MAHOUT_HOME/conf so I just added my jar directly to $MAHOUT_HOME/conf
> > > folder, added the entry of my class in drivers.classes.props file.
> > >
> > > I added the following line at the end
> > > com.musigma.hpc.CallFPGrowth = callfpgrowth : Calls fpgrowth
> > >
> > > com.musigma.hpc.CallFPGrowth is my class that I want to run from cmd and
> > > its in the jar.
> > >
> > > 3. Now when I am running bin/mahout, I am getting the following exception
> > >
> > > hadoop@ubuntu:/tmp/mahout-distribution-0.4$ bin/mahout
> > >
> > > Running on hadoop, using HADOOP_HOME=/usr/local/hadoop/hadoop
> > > HADOOP_CONF_DIR=/usr/local/hadoop/hadoop/conf
> > > 11/09/25 00:40:07 WARN driver.MahoutDriver: Unable to add class: com.musigma.hpc.CallFPGrowth
> > > java.lang.ClassNotFoundException: com.musigma.hpc.CallFPGrowth
> > >     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
> > >     at java.security.AccessController.doPrivileged(Native Method)
> > >     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
> > >     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
> > >     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
> > >     at java.lang.Class.forName0(Native Method)
> > >     at java.lang.Class.forName(Class.java:186)
> > >     at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:207)
> > >     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:117)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >     at java.lang.reflect.Method.invoke(Method.java:616)
> > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > >
> > > How can I resolve this issue ?
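
A note on the trace above: the ClassNotFoundException is raised from Class.forName inside MahoutDriver.addClass, which suggests the JVM started by bin/mahout could not see my jar on its classpath. Below is a minimal Java sketch for checking that outside of Mahout. ClassPathCheck and locate are names I made up for illustration; they are not part of Mahout or Hadoop, and the sketch only uses plain JDK calls.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Mahout or Hadoop.
public class ClassPathCheck {

    /**
     * Returns the location a class was loaded from, "(JDK runtime)" for
     * classes that ship with the JDK, or null if the class is not visible
     * on the current classpath.
     */
    public static String locate(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers
            Class<?> c = Class.forName(className, false, ClassPathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "(JDK runtime)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            // Same failure mode as in the MahoutDriver trace
            return null;
        }
    }

    public static void main(String[] args) {
        // Default to the class name used in this thread
        String name = args.length > 0 ? args[0] : "com.musigma.hpc.CallFPGrowth";
        String where = locate(name);
        if (where == null) {
            System.out.println(name + " is NOT on the classpath");
        } else {
            System.out.println(name + " loaded from " + where);
        }
    }
}
```

Running it with the same -classpath value that is handed to Hadoop should show whether the jar is actually picked up; if it reports the class as missing, the jar location is the thing to fix, not the props entry.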
> > >
> > > On Sat, Sep 24, 2011 at 2:55 PM, Lance Norskog <[email protected]> wrote:
> > >
> > > > Ah! That is all off in Maven-land. There is a maven feature called "exec".
> > > >
> > > > http://mojo.codehaus.org/exec-maven-plugin/
> > > >
> > > > There are examples for this in the Mahout wiki. Search for "exec:java".
> > > >
> > > > On Sat, Sep 24, 2011 at 2:42 AM, praveenesh kumar <[email protected]> wrote:
> > > >
> > > > > Which mahout jars are required to run this code and where I can find them ?
> > > > > I have this src downloaded .. but there are no jars in the src ?
> > > > >
> > > > > On Sat, Sep 24, 2011 at 2:35 AM, Paritosh Ranjan <[email protected]> wrote:
> > > > >
> > > > > > Just add the mahout jars in the class path while compiling/executing.
> > > > > > Search "java jar in classpath" on google.
> > > > > >
> > > > > > On 24-09-2011 15:01, praveenesh kumar wrote:
> > > > > >
> > > > > >> I mean to say..
> > > > > >>
> > > > > >> I have this code ..
> > > > > >>
> > > > > >> import java.io.File;
> > > > > >> import java.io.IOException;
> > > > > >> import java.nio.charset.Charset;
> > > > > >> import java.util.ArrayList;
> > > > > >> import java.util.Arrays;
> > > > > >> import java.util.Collection;
> > > > > >> import java.util.HashSet;
> > > > > >> import java.util.Map;
> > > > > >> import java.util.Set;
> > > > > >> import java.util.List;
> > > > > >>
> > > > > >> import org.apache.hadoop.conf.Configuration;
> > > > > >> import org.apache.hadoop.fs.FileSystem;
> > > > > >> import org.apache.hadoop.fs.Path;
> > > > > >> import org.apache.hadoop.io.SequenceFile;
> > > > > >> import org.apache.hadoop.io.Text;
> > > > > >> //import org.apache.lucene.util.Attribute;
> > > > > >> import org.apache.mahout.common.FileLineIterable;
> > > > > >> import org.apache.mahout.common.StringRecordIterator;
> > > > > >>
> > > > > >> import org.apache.mahout.fpm.pfpgrowth.convertors.ContextStatusUpdater;
> > > > > >> import org.apache.mahout.fpm.pfpgrowth.convertors.SequenceFileOutputCollector;
> > > > > >> import org.apache.mahout.fpm.pfpgrowth.convertors.string.StringOutputConverter;
> > > > > >>
> > > > > >> import org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns;
> > > > > >> import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth;
> > > > > >> //import org.apache.mahout.math.map.OpenLongObjectHashMap;
> > > > > >>
> > > > > >> import org.apache.mahout.common.Pair;
> > > > > >>
> > > > > >> public class DellFPGrowth {
> > > > > >>
> > > > > >>     public static void main(String[] args) throws IOException {
> > > > > >>
> > > > > >>         Set<String> features = new HashSet<String>();
> > > > > >>         String input = "/mnt/hgfs/Hadoop-automation/new-delltransaction.txt";
> > > > > >>         int minSupport = 1;
> > > > > >>         int maxHeapSize = 50;//top-k
> > > > > >>         String pattern = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";
> > > > > >>         Charset encoding = Charset.forName("UTF-8");
> > > > > >>         FPGrowth<String> fp = new FPGrowth<String>();
> > > > > >>         String output = "/tmp/output.txt";
> > > > > >>         Path path = new Path(output);
> > > > > >>         Configuration conf = new Configuration();
> > > > > >>         FileSystem fs = FileSystem.get(conf);
> > > > > >>
> > > > > >>         SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
> > > > > >>                 Text.class, TopKStringPatterns.class);
> > > > > >>
> > > > > >>         fp.generateTopKFrequentPatterns(
> > > > > >>                 new StringRecordIterator(new FileLineIterable(new File(input), encoding, false), pattern),
> > > > > >>                 fp.generateFList(
> > > > > >>                         new StringRecordIterator(new FileLineIterable(new File(input), encoding, false), pattern),
> > > > > >>                         minSupport),
> > > > > >>                 minSupport,
> > > > > >>                 maxHeapSize,
> > > > > >>                 features,
> > > > > >>                 new StringOutputConverter(new SequenceFileOutputCollector<Text,TopKStringPatterns>(writer)),
> > > > > >>                 new ContextStatusUpdater(null));
> > > > > >>
> > > > > >>         writer.close();
> > > > > >>
> > > > > >>         List<Pair<String,TopKStringPatterns>> frequentPatterns =
> > > > > >>                 FPGrowth.readFrequentPattern(fs, conf, path);
> > > > > >>         for (Pair<String,TopKStringPatterns> entry : frequentPatterns) {
> > > > > >>             System.out.println(entry.getSecond());
> > > > > >>         }
> > > > > >>         System.out.print("\nthe end! ");
> > > > > >>     }
> > > > > >>
> > > > > >> }
> > > > > >>
> > > > > >> How should I compile and run using command line..
> > > > > >> I don't have eclipse on my system. How can I run this code ?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Praveenesh
> > > > > >>
> > > > > >> On Sat, Sep 24, 2011 at 12:40 PM, Danny Bickson <[email protected]> wrote:
> > > > > >>
> > > > > >>> It is very simple: in the root folder you run (for example for k-means:)
> > > > > >>> ./bin/mahout kmeans -i ~/usr7/small_netflix_mahout/ -o ~/usr7/small_netflix_mahout_output/ --numClusters 10 -c ~/usr7/small_netflix_mahout/ -x 10
> > > > > >>>
> > > > > >>> where ./bin/mahout is used for any mahout application, and the next
> > > > > >>> keyword (kmeans in this case) defines the algorithm type.
> > > > > >>> The rest of the inputs are algorithm specific.
> > > > > >>>
> > > > > >>> If you want to add a new application to the existing ones, you need to
> > > > > >>> edit conf/driver.classes.props file and point into your main class.
> > > > > >>>
> > > > > >>> Best,
> > > > > >>>
> > > > > >>> - Danny Bickson
> > > > > >>>
> > > > > >>> On Sat, Sep 24, 2011 at 9:59 AM, praveenesh kumar <[email protected]> wrote:
> > > > > >>>
> > > > > >>>> Hey,
> > > > > >>>> I have this code written using mahout libraries. I am able to run the
> > > > > >>>> code from eclipse
> > > > > >>>> How can I run the code written in mahout from command line ?
> > > > > >>>>
> > > > > >>>> My question is do I have to make a jar file and run it as hadoop jar
> > > > > >>>> jarfilename.jar class
> > > > > >>>> or shall I run it using simple java command ?
> > > > > >>>>
> > > > > >>>> Can anyone solve my confusion ?
> > > > > >>>> I am not able to run this code.
> > > > > >>>>
> > > > > >>>> Thanks,
> > > > > >>>> Praveenesh
> > > >
> > > > --
> > > > Lance Norskog
> > > > [email protected]
> >
> > --
> > Lance Norskog
> > [email protected]
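
For reference, the drivers.classes.props entry from step 4 at the top of this message has the shape "class name = short name : description". The Java sketch below shows one way such a line can be read with java.util.Properties and split into its parts. It is only an illustration of the entry format under that assumption: DriverEntryDemo and parseValue are hypothetical names, and this is not the code MahoutDriver itself uses.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical demo class, not part of Mahout.
public class DriverEntryDemo {

    /** Splits "shortName : description" into {shortName, description}. */
    public static String[] parseValue(String value) {
        // Limit of 2 so that any further ':' stays inside the description
        String[] parts = value.split(":", 2);
        String shortName = parts[0].trim();
        String description = parts.length > 1 ? parts[1].trim() : "";
        return new String[] {shortName, description};
    }

    public static void main(String[] args) throws IOException {
        // The entry quoted in step 4 of this message
        String line = "com.musigma.hpc.CallFPGrowth = callfpgrowth : calling fpgrowth";

        Properties props = new Properties();
        props.load(new StringReader(line));

        // The property key is the class name, the value holds short name and description
        for (String className : props.stringPropertyNames()) {
            String[] entry = parseValue(props.getProperty(className));
            System.out.println(entry[0] + " -> " + className + " (" + entry[1] + ")");
        }
    }
}
```

The short name on the left of the colon is what goes on the command line after "mahout", which is why step 6 calls "mahout callfpgrowth".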
