On Thu, Jan 3, 2013 at 8:08 AM, Stefan Kreuzer <[email protected]> wrote:

> But even with a small weight (not sure how to apply that) i still have the
> wrong number of centroids, i.e. the wrong k?
>

I didn't think so. I seem to be confused about what you want.

> I imagined something like:
>
> 1. Do canopy clustering with clusterFilter param => retrieve a folder with
> x canopy clusters and a folder with x+n canopy centroids, where x
> represents a good value for k.
> 2. Remove centroids that do not correspond with any of the canopy clusters.
> 3. Use these reduced set of canopy centroid as seed for k-means.
>

What about running a single k-means assignment pass where you assign the x+n canopy centroids to each of the x clusters that you have?
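
For illustration, a single assignment pass of that kind might look roughly like the sketch below. It is plain Python, not Mahout's API; the function name `assign_to_nearest`, the Euclidean distance, and the sample coordinates are all assumptions made up for the example. Each of the x+n canopy centroids is given the index of its nearest cluster center, and the centroids are then grouped by that index.

```python
import math

def assign_to_nearest(points, centers):
    """Return, for each point, the index of its nearest center (Euclidean)."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        for p in points
    ]

# Hypothetical data: x = 2 cluster centers and x + n = 4 canopy centroids.
cluster_centers = [(0.0, 0.0), (10.0, 10.0)]
canopy_centroids = [(0.5, 0.2), (9.0, 11.0), (1.0, -1.0), (10.5, 9.5)]

# One assignment pass: no centers are moved, points are only labelled.
assignment = assign_to_nearest(canopy_centroids, cluster_centers)

# Group the canopy centroids by the cluster they were assigned to.
groups = {j: [] for j in range(len(cluster_centers))}
for centroid, j in zip(canopy_centroids, assignment):
    groups[j].append(centroid)

print(assignment)
print(groups)
```

What to do with each group afterwards (average it, keep one representative, or drop empty clusters) is a separate choice that this sketch leaves open.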
