Just add the Mahout jars to the classpath when compiling/executing.
Search for "java jar in classpath" on Google.
On 24-09-2011 15:01, praveenesh kumar wrote:
I mean to say, I have this code:
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
//import org.apache.lucene.util.Attribute;
import org.apache.mahout.common.FileLineIterable;
import org.apache.mahout.common.Pair;
import org.apache.mahout.common.StringRecordIterator;
import org.apache.mahout.fpm.pfpgrowth.convertors.ContextStatusUpdater;
import org.apache.mahout.fpm.pfpgrowth.convertors.SequenceFileOutputCollector;
import org.apache.mahout.fpm.pfpgrowth.convertors.string.StringOutputConverter;
import org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns;
import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth;
//import org.apache.mahout.math.map.OpenLongObjectHashMap;

public class DellFPGrowth {

    public static void main(String[] args) throws IOException {
        // Mining parameters: input transactions, minimum support and the
        // number of top patterns (k) to keep per feature.
        Set<String> features = new HashSet<String>();
        String input = "/mnt/hgfs/Hadoop-automation/new-delltransaction.txt";
        int minSupport = 1;
        int maxHeapSize = 50; // top-k
        String pattern = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";
        Charset encoding = Charset.forName("UTF-8");
        FPGrowth<String> fp = new FPGrowth<String>();

        // Sequence file that will hold the mined patterns.
        String output = "/tmp/output.txt";
        Path path = new Path(output);
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
                Text.class, TopKStringPatterns.class);

        // Mine the top-k frequent patterns and write them out.
        fp.generateTopKFrequentPatterns(
                new StringRecordIterator(
                        new FileLineIterable(new File(input), encoding, false), pattern),
                fp.generateFList(
                        new StringRecordIterator(
                                new FileLineIterable(new File(input), encoding, false), pattern),
                        minSupport),
                minSupport,
                maxHeapSize,
                features,
                new StringOutputConverter(
                        new SequenceFileOutputCollector<Text, TopKStringPatterns>(writer)),
                new ContextStatusUpdater(null));
        writer.close();

        // Read the patterns back from the sequence file and print them.
        List<Pair<String, TopKStringPatterns>> frequentPatterns =
                FPGrowth.readFrequentPattern(fs, conf, path);
        for (Pair<String, TopKStringPatterns> entry : frequentPatterns) {
            System.out.println(entry.getSecond());
        }
        System.out.print("\nthe end! ");
    }
}
How should I compile and run it from the command line? I don't have Eclipse on my system. How can I run this code?
Thanks,
Praveenesh
On Sat, Sep 24, 2011 at 12:40 PM, Danny Bickson <[email protected]> wrote:
It is very simple: from the Mahout root folder you run, for example for k-means:
./bin/mahout kmeans -i ~/usr7/small_netflix_mahout/ -o ~/usr7/small_netflix_mahout_output/ --numClusters 10 -c ~/usr7/small_netflix_mahout/ -x 10
where ./bin/mahout is used for any Mahout application and the next keyword (kmeans in this case) selects the algorithm. The rest of the arguments are algorithm-specific.
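If you are not sure which flags a particular algorithm takes, most of the drivers print their full option list when you pass --help, for example (as far as I remember):

./bin/mahout kmeans --help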
If you want to add a new application to the existing ones, you need to edit the conf/driver.classes.props file and point it to your main class.
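The entries in that file map a driver class to the short command name used on the command line, one per line. Roughly like this (the kmeans line is how I remember the stock file; the DellFPGrowth line is only an illustration -- use your own class name and make sure the class is on Mahout's classpath, e.g. inside the job jar):

org.apache.mahout.clustering.kmeans.KMeansDriver = kmeans : K-means clustering
DellFPGrowth = dellfp : FPGrowth example over the Dell transactions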
Best,
- Danny Bickson
On Sat, Sep 24, 2011 at 9:59 AM, praveenesh kumar <[email protected]> wrote:
Hey,
I have this code written using the Mahout libraries, and I am able to run it from Eclipse.
How can I run code written with Mahout from the command line?
My question is: do I have to build a jar file and run it as hadoop jar jarfilename.jar class, or can I run it with a plain java command?
Can anyone clear up my confusion? I am not able to run this code.
Thanks,
Praveenesh