Okay, here's what I am trying to do.
My code is this:
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.mahout.common.FileLineIterable;
import org.apache.mahout.common.Pair;
import org.apache.mahout.common.StringRecordIterator;
import org.apache.mahout.fpm.pfpgrowth.convertors.ContextStatusUpdater;
import org.apache.mahout.fpm.pfpgrowth.convertors.SequenceFileOutputCollector;
import org.apache.mahout.fpm.pfpgrowth.convertors.string.StringOutputConverter;
import org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns;
import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth;

public class DellFPGrowth {

    public static void main(String[] args) throws IOException {
        Set<String> features = new HashSet<String>();
        String input = "/mnt/hgfs/Hadoop-automation/new-delltransaction.txt";
        int minSupport = 1;
        int maxHeapSize = 50; // top-k: how many patterns to keep per feature
        String pattern = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";
        Charset encoding = Charset.forName("UTF-8");
        FPGrowth<String> fp = new FPGrowth<String>();
        String output = "/tmp/output.txt";
        Path path = new Path(output);
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // results are written as a sequence file of (feature, top-k patterns)
        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
                Text.class, TopKStringPatterns.class);

        fp.generateTopKFrequentPatterns(
                new StringRecordIterator(new FileLineIterable(new File(input), encoding, false), pattern),
                fp.generateFList(
                        new StringRecordIterator(new FileLineIterable(new File(input), encoding, false), pattern),
                        minSupport),
                minSupport,
                maxHeapSize,
                features,
                new StringOutputConverter(new SequenceFileOutputCollector<Text, TopKStringPatterns>(writer)),
                new ContextStatusUpdater(null));

        writer.close();

        // read the patterns back from the sequence file and print them
        List<Pair<String, TopKStringPatterns>> frequentPatterns =
                FPGrowth.readFrequentPattern(fs, conf, path);
        for (Pair<String, TopKStringPatterns> entry : frequentPatterns) {
            System.out.println(entry.getSecond());
        }
        System.out.print("\nthe end! ");
    }
}
1. I am able to compile and run this code from Eclipse, so I took the .class
file from the Eclipse target folder, put it in another directory, and made a
simple jar file using the jar -cvf command.
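For reference, the command-line equivalent would be roughly the sketch below
(the jar names are illustrative and the output name callfpgrowth.jar is made
up here; adjust everything to your own layout):

  # compile against the mahout and hadoop jars (adjust jar names to your versions)
  javac -cp "$MAHOUT_HOME/mahout-core-0.4.jar:$MAHOUT_HOME/mahout-math-0.4.jar:$HADOOP_HOME/hadoop-core.jar" CallFPGrowth.java
  # package, preserving the package directory layout (com/musigma/hpc/...)
  mkdir -p build/com/musigma/hpc
  cp CallFPGrowth.class build/com/musigma/hpc/
  cd build && jar -cvf ../callfpgrowth.jar com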
2. Since I am using Mahout 0.4 and MAHOUT_CONF_DIR points by default to
$MAHOUT_HOME/conf, I just added my jar directly to the $MAHOUT_HOME/conf
folder and added an entry for my class to the drivers.classes.props file.
I appended the following line at the end:
com.musigma.hpc.CallFPGrowth = callfpgrowth : Calls fpgrowth
com.musigma.hpc.CallFPGrowth is the class that I want to run from the command
line, and it is in the jar.
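One sanity check I did (reusing the made-up jar name from above): the class
loader can only find com.musigma.hpc.CallFPGrowth if the jar lists it under
its full package path.

  # the listing should contain com/musigma/hpc/CallFPGrowth.class
  jar -tf $MAHOUT_HOME/conf/callfpgrowth.jar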
3. Now when I run bin/mahout, I get the following exception:
hadoop@ubuntu:/tmp/mahout-distribution-0.4$ bin/mahout
Running on hadoop, using HADOOP_HOME=/usr/local/hadoop/hadoop
HADOOP_CONF_DIR=/usr/local/hadoop/hadoop/conf
11/09/25 00:40:07 WARN driver.MahoutDriver: Unable to add class: com.musigma.hpc.CallFPGrowth
java.lang.ClassNotFoundException: com.musigma.hpc.CallFPGrowth
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:186)
at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:207)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:117)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
How can I resolve this issue?
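In the meantime, as a workaround I am also trying to bypass bin/mahout and
launch the class through hadoop jar directly. A sketch (again reusing the
made-up jar name from above; the exact Mahout jar names may differ):

  # put the mahout jars on hadoop's classpath, then run the driver class directly
  export HADOOP_CLASSPATH=$MAHOUT_HOME/mahout-core-0.4.jar:$MAHOUT_HOME/mahout-math-0.4.jar
  hadoop jar callfpgrowth.jar com.musigma.hpc.CallFPGrowth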
On Sat, Sep 24, 2011 at 2:55 PM, Lance Norskog <[email protected]> wrote:
> Ah! That is all off in Maven-land. There is a Maven feature called "exec".
>
> http://mojo.codehaus.org/exec-maven-plugin/
>
> There are examples for this in the Mahout wiki. Search for "exec:java".
>
> On Sat, Sep 24, 2011 at 2:42 AM, praveenesh kumar <[email protected]> wrote:
>
> > Which Mahout jars are required to run this code, and where can I find them?
> > I have the src downloaded, but there are no jars in the src.
> >
> >
> > On Sat, Sep 24, 2011 at 2:35 AM, Paritosh Ranjan <[email protected]>
> > wrote:
> >
> > > Just add the Mahout jars to the classpath while compiling/executing.
> > > Search "java jar in classpath" on Google.
> > >
> > >
> > > On 24-09-2011 15:01, praveenesh kumar wrote:
> > >
> > >> What I mean to say is:
> > >>
> > >> I have this code:
> > >>
> > >> [same DellFPGrowth code as above, snipped]
> > >>
> > >> How should I compile and run it from the command line?
> > >> I don't have Eclipse on my system. How can I run this code?
> > >>
> > >> Thanks,
> > >> Praveenesh
> > >>
> > >> On Sat, Sep 24, 2011 at 12:40 PM, Danny Bickson <[email protected]> wrote:
> > >>
> > >>> It is very simple: in the root folder you run (for example, for k-means):
> > >>> ./bin/mahout kmeans -i ~/usr7/small_netflix_mahout/ -o ~/usr7/small_netflix_mahout_output/ --numClusters 10 -c ~/usr7/small_netflix_mahout/ -x 10
> > >>>
> > >>> where ./bin/mahout is used for any Mahout application, and the next keyword
> > >>> (kmeans in this case) defines the algorithm type.
> > >>> The rest of the inputs are algorithm-specific.
> > >>>
> > >>> If you want to add a new application to the existing ones, you need to
> > >>> edit the conf/driver.classes.props file and point it to your main class.
> > >>>
> > >>> Best,
> > >>>
> > >>> - Danny Bickson
> > >>>
> > >>> On Sat, Sep 24, 2011 at 9:59 AM, praveenesh kumar <[email protected]> wrote:
> > >>>> Hey,
> > >>>> I have this code written using the Mahout libraries, and I am able to
> > >>>> run it from Eclipse.
> > >>>> How can I run code written with Mahout from the command line?
> > >>>>
> > >>>> My question is: do I have to make a jar file and run it as hadoop jar
> > >>>> jarfilename.jar class, or shall I run it using a plain java command?
> > >>>>
> > >>>> Can anyone clear up my confusion?
> > >>>> I am not able to run this code.
> > >>>>
> > >>>> Thanks,
> > >>>> Praveenesh
> > >>>>
> > >>>>
> > >>
> > >
> > >
> >
>
>
>
> --
> Lance Norskog
> [email protected]
>