On Thu, Jan 3, 2013 at 8:08 AM, Stefan Kreuzer <[email protected]> wrote:

> But even with a small weight (not sure how to apply that) I still have the
> wrong number of centroids, i.e. the wrong k?
>

I didn't think so.  I seem to be confused about what you want.


> I imagined something like:
>
> 1. Do canopy clustering with clusterFilter param => retrieve a folder with
> x canopy clusters and a folder with x+n canopy centroids, where x
> represents a good value for k.
>
> 2. Remove centroids that do not correspond to any of the canopy clusters.
> 3. Use this reduced set of canopy centroids as seeds for k-means.
>

What about running a single k-means assignment pass where you assign the
x+n canopy centroids to each of the x clusters that you have?
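
To make that concrete, here is a rough sketch of what I mean by a single
assignment pass. It is plain Python rather than Mahout code, and the function
names (assign_to_nearest, reduce_centroids) are made up for illustration.
It also assumes Euclidean distance and that you collapse each group of
centroids by averaging; keeping the one closest to the cluster center would
be an equally reasonable choice.

```python
import math


def assign_to_nearest(points, centers):
    """For each point, return the index of the closest center (Euclidean)."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        for p in points
    ]


def reduce_centroids(canopy_centroids, cluster_centers):
    """Hypothetical helper: one k-means style assignment pass.

    Each of the x+n canopy centroids is assigned to the nearest of the
    x cluster centers, then each group is averaged into a single seed.
    A cluster that attracts no centroid keeps its own center as the seed.
    """
    labels = assign_to_nearest(canopy_centroids, cluster_centers)
    seeds = []
    for j, center in enumerate(cluster_centers):
        members = [c for c, lab in zip(canopy_centroids, labels) if lab == j]
        if not members:
            seeds.append(list(center))
            continue
        dim = len(center)
        seeds.append(
            [sum(m[d] for m in members) / len(members) for d in range(dim)]
        )
    return labels, seeds


# Toy data: x = 2 clusters, x + n = 5 canopy centroids.
cluster_centers = [[0.0, 0.0], [10.0, 10.0]]
canopy_centroids = [
    [0.0, 1.0], [1.0, 0.0], [9.0, 10.0], [11.0, 10.0], [0.5, 0.5],
]
labels, seeds = reduce_centroids(canopy_centroids, cluster_centers)
print(labels)  # which cluster each canopy centroid landed in
print(seeds)   # exactly x seeds to hand to k-means
```

Either way you end up with exactly x seeds, which is the k you said you
wanted.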
