Hi, I'm running some experiments with k-means and I have a few questions about the way the cluster size is computed (they apply to other clustering algorithms as well).

1. AbstractCluster stores the number of points. It looks like the method computeParameters() uses "s0" to determine the number of points. "s0" is computed from the weight we assign to each point; the default weight is 1.0, so in that case there's no problem. However, if we modify the weight, then the number of points would be off, wouldn't it? Is that intentional?

2. Regardless of (1), it seems that the cluster dumper does not always print the right number of points for a cluster. I haven't looked into it much yet, but my first guess is that "numPoints" in AbstractCluster refers to the number of points in the cluster for the given iteration, which is what the dumper prints, while the actual number of points in a given cluster might change after the final assignment of points to clusters is done. The ClusterLabels class computes the number of points in a cluster from the actual clusteredPoints directory and gets it right.

I will look into it further, but if you have any pointers, that would save me time.

Thanks!
-- Yuval
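
P.S. To make question 1 concrete, here is a minimal sketch of what I think is happening. This is not Mahout's code: the class ClusterStats and its methods are hypothetical names I made up, and the truncating cast from "s0" to a count is my assumption about how the count is derived. It only shows that a count taken from the sum of weights differs from the number of points observed as soon as a weight is not 1.0.

```java
public class WeightedCountSketch {

    /** Hypothetical stand-in for a cluster's running statistics; not Mahout's actual class. */
    static final class ClusterStats {
        private double s0 = 0.0;   // running sum of the observation weights
        private long observed = 0; // number of observe() calls, independent of the weights

        /** Record one point with the given weight (the point's coordinates are omitted here). */
        void observe(double weight) {
            s0 += weight;
            observed++;
        }

        /** Count derived from the sum of weights (assumed behaviour, truncating cast). */
        long numPointsFromS0() {
            return (long) s0;
        }

        /** Count of points actually observed. */
        long numPointsObserved() {
            return observed;
        }
    }

    public static void main(String[] args) {
        ClusterStats stats = new ClusterStats();
        stats.observe(1.0); // default weight
        stats.observe(2.5); // non-default weights make the two counts diverge
        stats.observe(0.5);

        System.out.println("count derived from s0:    " + stats.numPointsFromS0());
        System.out.println("points actually observed: " + stats.numPointsObserved());
    }
}
```

With these three weights the sum is 4.0, so the s0-based count is 4 while only 3 points were observed. If that is the intended behaviour, keeping a separate counter like "observed" above would be one way to report the real number of points.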
