On 12.11.2011 15:52, Frank Scholten wrote:
Hi Sachin,
Most Mahout jobs have several overloaded run methods. For example:
KMeansDriver.run(configuration, input, clustersIn, output, measure,
convergenceDelta, maxIterations, runClustering, runSequential)
Also, most of them extend AbstractJob and implement Hadoop's Tool
interface, so you can use Hadoop's ToolRunner and pass it an array of
the arguments you would specify on the command line:
// Every element must be a String, including numeric values such as numClusters
String[] kmeansArgs = new String[] {
    "--input", inputPath,
    "--output", outputPath,
    "--numClusters", String.valueOf(numClusters),
    // More arguments
};
ToolRunner.run(configuration, new KMeansDriver(), kmeansArgs);
See
https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/common/AbstractJob.html
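
To make the argument-array approach a bit more concrete, here is a minimal sketch in plain Java that only builds the array. The class name KMeansArgs, the build helper and the example paths are made up for illustration and are not part of Mahout. The ToolRunner call is left as a comment so the sketch compiles without Mahout or Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a Mahout class: collects command-line style arguments.
class KMeansArgs {

    // Builds the flag/value pairs; the int is converted because the array holds Strings only.
    static String[] build(String inputPath, String outputPath, int numClusters) {
        List<String> args = new ArrayList<>();
        args.add("--input");
        args.add(inputPath);
        args.add("--output");
        args.add(outputPath);
        args.add("--numClusters");
        args.add(String.valueOf(numClusters));
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Example paths are placeholders.
        String[] kmeansArgs = build("/tmp/points", "/tmp/kmeans-out", 10);
        System.out.println(Arrays.toString(kmeansArgs));
        // With Mahout and Hadoop available, the array would be handed over as in the mail:
        // ToolRunner.run(configuration, new KMeansDriver(), kmeansArgs);
    }
}
```

Building the array in one place keeps the String conversion of numeric options from being forgotten, which is an easy mistake when the values come from typed variables.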
Frank
Hi Sachin,
As Frank just mentioned, most Mahout command-line tools are Java
classes with a main method that calls the actual job. For example,
the ClusterDumper command calls the job like this:
public static void main(String[] args) throws Exception {
    new ClusterDumper().run(args);
}
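
Because main only delegates to run, the same run method can be called directly from your own code. The sketch below shows that pattern with a stand-in class: DumperLikeTool and the "--input" flag are hypothetical and not taken from Mahout. The int return value mirrors the exit code returned by run in Hadoop's Tool interface.

```java
import java.util.Arrays;

// Hypothetical stand-in for a Mahout utility; it only demonstrates the delegation pattern.
class DumperLikeTool {

    // Placeholder for the real job logic; returns 0 on success, like an exit code.
    int run(String[] args) {
        System.out.println("running with " + Arrays.toString(args));
        return 0;
    }

    // Command-line entry point: does nothing except hand the arguments to run.
    public static void main(String[] args) {
        int exitCode = new DumperLikeTool().run(args);
        if (exitCode != 0) {
            System.exit(exitCode);
        }
    }
}
```

From application code the call would then be something like new DumperLikeTool().run(new String[] {"--input", "/some/path"}), with no separate process involved.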
You can find out how to access the cluster by reading the source code
of ClusterDumper and the other utilities:
http://svn.apache.org/viewvc/mahout/trunk/integration/src/main/java/org/apache/mahout/utils/clustering/ClusterDumper.java?view=markup