Hi all,
 
I am playing with Mahout; in particular, I am trying to get results from 
clustering algorithms such as K-means.
I am using the Hadoop 1.2 implementation on an HDInsight cluster along with 
Mahout 0.9.
What I am trying to do is take a set of synthetic data and cluster it.
I am running the following command from the Hadoop command line:
 
hadoop jar %mahoutdir%\mahout-examples-0.9-job.jar 
org.apache.mahout.clustering.syntheticcontrol.kmeans.Job --input 
/user/myuser/simulation --output /user/myuser/simulation-output -k 5 -t1 20 -t2 
50 -x 20 -ow
 
The Mapper and Reducer apparently execute correctly. I then look at the 
results by running this command:
 
hadoop jar %mahoutdir%\mahout-examples-0.9-job.jar 
org.apache.mahout.driver.MahoutDriver clusterdump -i 
/user/myuser/simulation-output/clusters-5-final/ -of TEXT -o 
/user/myuser/output/simulation.txt
 
The result I get is a list of centroids, but this is not what I expect. I 
expect a set of clusters, each listing the data points assigned to it.
I am obviously making a mistake somewhere, but I do not know how or where.
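
To make clear what kind of output I am after, here is a minimal sketch in 
Python. It is not Mahout code: the centroid ids, coordinates, and the 
assign_to_clusters helper are made up for illustration, and it assumes plain 
Euclidean distance. It only shows the shape of the result I expect, with each 
point listed under its nearest centroid.

```python
import math

# Made-up 2-D values, only to illustrate the output shape I expect.
centroids = {"C-0": (0.0, 0.0), "C-1": (10.0, 10.0)}
points = [(1.0, 1.0), (9.0, 8.0), (0.5, -0.5), (11.0, 10.0)]


def assign_to_clusters(centroids, points):
    """Group each point under the id of its nearest centroid (Euclidean)."""
    clusters = {cid: [] for cid in centroids}
    for p in points:
        # Pick the centroid id with the smallest distance to this point.
        nearest = min(centroids, key=lambda cid: math.dist(p, centroids[cid]))
        clusters[nearest].append(p)
    return clusters


clusters = assign_to_clusters(centroids, points)
for cid, members in clusters.items():
    # One line per cluster: id, centroid, then the points assigned to it.
    print(cid, centroids[cid], members)
```

In other words, I would like the dump to contain the members of each cluster 
and not only the centroid line.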
 
What am I doing wrong?
Why am I not able to specify the -cl option when executing 
org.apache.mahout.clustering.syntheticcontrol.kmeans.Job? If I do, I get an 
error.
Is there any other way to execute the k-means algorithm?
 
Thank you in advance for the help.
Regards,
Ernesto                                           
