Hello,

This is my first email to the mahout-user list.
I am trying to do some clustering with Mahout and I have a question concerning
the cluster center and cluster radius.

For testing, I clustered 10 points using the KMeansClusterer:

points:
 [13.000, 4455.000] 
 [13.000, 5101.000] 
 [13.000,   333.000] 
 [13.000, 3412.000] 
 [13.000,   823.000] 
 [13.000,   238.000]
 [13.000,   951.000]
 [  9.000,   311.000] 
 [  9.000,   970.000] 
 [10.000, 2885.000]

This is the method I am using:

clusters = KMeansClusterer.clusterPoints(points, initial_clusters, measure, 10, 0.001);

initial_clusters are two points chosen at random from the points above, and
measure is an EuclideanDistanceMeasure.


And this is the result of the converged clusters VL-0 and VL-1:

VL-0{n=6 c=[11.667, 604.333] r=[1.886, 315.059]}
VL-1{n=4 c=[12.250, 3963.250] r=[1.299, 866.428]}

If I understand this output correctly, n is the number of points assigned to
the cluster, c is the cluster center, and r is the radius of the cluster.
So every point belongs to either cluster 0 or cluster 1. You can even guess
which points belong to which cluster, but I am confused by the calculated
cluster center and cluster radius:
For example, [9.000, 970.000] should belong to cluster 0, but
9.000 < 9.781 [11.667 - 1.886] and 970.000 > 919.392 [604.333 + 315.059].
The point is not within the range of cluster 0, and it obviously does not
belong to cluster 1, yet all 10 points are assigned to clusters. Can someone
please tell me where the mistake is?
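
To make the check above concrete, here is a minimal sketch in plain Java. It
does not use any Mahout classes; the names ClusterRangeCheck, euclidean and
withinRadius are made up for this illustration. It assumes that r is meant as
a per-dimension bound around c, which is exactly the assumption I am unsure
about.

```java
class ClusterRangeCheck {

    // Euclidean distance between two points of equal dimension.
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // True if every coordinate of p lies within [c - r, c + r].
    // Assumption: r is read as a hard per-dimension bound around c.
    static boolean withinRadius(double[] p, double[] c, double[] r) {
        for (int i = 0; i < p.length; i++) {
            if (Math.abs(p[i] - c[i]) > r[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] point = {9.0, 970.0};

        // Values copied from the VL-0 and VL-1 output above.
        double[] c0 = {11.667, 604.333};
        double[] r0 = {1.886, 315.059};
        double[] c1 = {12.250, 3963.250};
        double[] r1 = {1.299, 866.428};

        System.out.println("distance to VL-0 center: " + euclidean(point, c0));
        System.out.println("distance to VL-1 center: " + euclidean(point, c1));
        System.out.println("within c0 +/- r0: " + withinRadius(point, c0, r0));
        System.out.println("within c1 +/- r1: " + withinRadius(point, c1, r1));
    }
}
```

With these numbers the point is much closer to the VL-0 center than to the
VL-1 center, but it falls outside c +/- r for both clusters.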


Greetings, Immo






