Hi, I am new to Mahout and tried to run k-means clustering with Mahout on a Cloudera VM (with Hadoop installed), using this command:
root@cloudera-vm:/map_reduce_samples# mahout kmeans -i hdfs://localhost/mahout_input/ip -o hdfs://localhost/mahout_output/output_kmeans_07_29/ -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure -cd 1.0 -c hdfs://localhost/mahout_input/centroids_07_29 -k 5 -x 5

The job writes its results to the output folder shown in the listing further down. I can inspect the individual cluster directories with clusterdump, but I cannot find the folder named clusteredPoints, which maps the original points to their cluster IDs. Am I missing something?

This is how the output folder looks:

root@cloudera-vm:/map_reduce_samples# hadoop fs -ls /mahout_output/output_kmeans_07_29
Found 5 items
drwxr-xr-x   - root supergroup          0 2011-07-29 11:23 /mahout_output/output_kmeans_07_29/clusters-1
drwxr-xr-x   - root supergroup          0 2011-07-29 11:23 /mahout_output/output_kmeans_07_29/clusters-2
drwxr-xr-x   - root supergroup          0 2011-07-29 11:23 /mahout_output/output_kmeans_07_29/clusters-3
drwxr-xr-x   - root supergroup          0 2011-07-29 11:23 /mahout_output/output_kmeans_07_29/clusters-4
drwxr-xr-x   - root supergroup          0 2011-07-29 11:23 /mahout_output/output_kmeans_07_29/clusters-5

PS: When I ran the Java version of the "cluster apples" example from the book, it created 3 folders for the clusters, plus the clusteredPoints folder containing the mappings.

Any help would be greatly appreciated.

Thanks and Regards,
Abhik Banerjee
