On 12.11.2011 15:52, Frank Scholten wrote:
Hi Sachin,

Most Mahout jobs have several overloaded run methods. For example:

KMeansDriver.run(configuration, input, clustersIn, output, measure,
    convergenceDelta, maxIterations, runClustering, runSequential)

Also, most of them extend AbstractJob and implement Hadoop's Tool
interface, so you can use Hadoop's ToolRunner and pass it an array with
the arguments you would otherwise specify on the command line.

String[] kmeansArgs = new String[] {
   "--input", inputPath,
   "--output", outputPath,
   "--numClusters", String.valueOf(numClusters), // every element must be a String
   // More arguments
};

ToolRunner.run(configuration, new KMeansDriver(), kmeansArgs);
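
As a minimal sketch of assembling that argument array in plain Java: the class name KMeansArgs, the build helper, and the paths are made up for illustration and are not part of Mahout. Only the three option names shown above are used. The ToolRunner call is left as a comment because it needs Hadoop and Mahout on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahout: builds the command-line style
// argument array that would be handed to ToolRunner.
public class KMeansArgs {

    // Every element must be a String, so numeric values are converted here.
    public static String[] build(String inputPath, String outputPath, int numClusters) {
        List<String> args = new ArrayList<>();
        args.add("--input");
        args.add(inputPath);
        args.add("--output");
        args.add(outputPath);
        args.add("--numClusters");
        args.add(String.valueOf(numClusters));
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Placeholder paths, chosen only for this example.
        String[] kmeansArgs = build("/tmp/kmeans/input", "/tmp/kmeans/output", 10);
        System.out.println(String.join(" ", kmeansArgs));

        // With Hadoop and Mahout available, the array would then be passed on:
        // int exitCode = ToolRunner.run(new Configuration(), new KMeansDriver(), kmeansArgs);
    }
}
```

ToolRunner.run returns the exit code of the job as an int, so a caller can check for a non-zero value to detect failure.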

See 
https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/common/AbstractJob.html

Frank


Hi Sachin,

As Frank just mentioned, most Mahout command-line tools are Java classes with a main method that calls the actual job. For example, the ClusterDumper command calls the job like this:

public static void main(String[] args) throws Exception {
    new ClusterDumper().run(args);
}

You can find out how to access the cluster by reading the source code of ClusterDumper and the other utilities.

http://svn.apache.org/viewvc/mahout/trunk/integration/src/main/java/org/apache/mahout/utils/clustering/ClusterDumper.java?view=markup


