What do your input vectors look like?
How many canopies did you get in clusters-0?
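
The canopy count matters because the canopies in clusters-0 are what you pass to KMeansDriver as the initial clusters. Below is a rough sketch in plain Java of how T2 drives that count. It is not Mahout's implementation; the class name, method names and sample vectors are all made up for illustration. The assumption it demonstrates: if your vectors are short (unit length here, so no two are more than 2.0 apart), a Euclidean T2 of 120 would absorb every point into the first canopy.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class CanopySketch {

    // Plain Euclidean distance between two dense vectors of equal length.
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Counts canopies: each remaining point in turn becomes a centre, and every
    // point closer than t2 to that centre is removed from further consideration.
    // T1 only decides loose membership, so it is left out of this count.
    static int countCanopies(double[][] points, double t2) {
        List<double[]> remaining = new ArrayList<>();
        for (double[] p : points) {
            remaining.add(p);
        }
        int canopies = 0;
        while (!remaining.isEmpty()) {
            double[] centre = remaining.remove(0);
            canopies++;
            Iterator<double[]> it = remaining.iterator();
            while (it.hasNext()) {
                if (euclidean(centre, it.next()) < t2) {
                    it.remove();
                }
            }
        }
        return canopies;
    }

    public static void main(String[] args) {
        // Hypothetical unit-length vectors, not taken from the poster's data.
        double[][] points = { {1, 0, 0}, {0, 1, 0}, {0, 0, 1}, {0.6, 0.8, 0} };
        System.out.println(countCanopies(points, 120)); // T2 from the quoted CanopyDriver call
        System.out.println(countCanopies(points, 0.5)); // a hypothetical, much smaller T2
    }
}
```

If your real vectors behave like the sample ones, the first call gives a single canopy, and k-means would then start from a single cluster. Whether that is what happened depends on the actual scale of your vectors, hence the two questions above.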

-----Original Message-----
From: eric skinner [mailto:[email protected]] 
Sent: Wednesday, August 10, 2011 8:33 AM
To: [email protected]
Subject: issues on Mahout clustering result using K-means

I ran the K-means clustering algorithm against a set of sequence files.
However, the generated result looks like this:

0 belongs to cluster 1.0: []

0 belongs to cluster 1.0: []

0 belongs to cluster 1.0: []

0 belongs to cluster 1.0: []

0 belongs to cluster 1.0: []

0 belongs to cluster 1.0: []

Could you tell me why I get this kind of result? Is it caused by a specific
parameter setting, or by something else?

The program I use is adapted from NewsKMeansClustering.java, an example
from chapter 9 of Mahout in Action.

The core clustering code in this program is:

CanopyDriver.run(vectorsFolder, canopyCentroids,
    new EuclideanDistanceMeasure(), 250, 120, false, false);

KMeansDriver.run(conf, vectorsFolder, new Path(canopyCentroids, "clusters-0"),
    clusterOutput, new TanimotoDistanceMeasure(), 0.01, 20, true, false);
