Sorry for my English, can't express myself too well :-(

Basically, I want to do this:
I have some canopy clusters as the result of a canopy clustering pass. Now I want to generate a "centroids" folder containing just the centroids of these clusters.

Maybe it is too simple for anyone knowledgeable about Mahout, so it goes under the radar.

-----Original Message-----
From: Ted Dunning <[email protected]>
To: user <[email protected]>
Sent: Thu, 3 Jan 2013 5:13 pm
Subject: Re: Seeding k-means with canopy clustering / Filter canopies


On Thu, Jan 3, 2013 at 8:08 AM, Stefan Kreuzer <[email protected]> wrote:

But even with a small weight (not sure how to apply that), I still have
the wrong number of centroids, i.e. the wrong k?


I didn't think so.  I seem to be confused about what you want.


I imagined something like:

1. Do canopy clustering with the clusterFilter param => retrieve a folder
with x canopy clusters and a folder with x+n canopy centroids, where x
represents a good value for k.

2. Remove centroids that do not correspond to any of the canopy clusters.
3. Use this reduced set of canopy centroids as the seed for k-means.
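
Step 2 above could be sketched roughly like this in Python. This is only an illustration under assumptions: it supposes that centroids and clusters share an identifier and that both have already been read into memory. The function name, the ids and the vectors are made up for the example, and reading the actual Mahout output folders is not shown.

```python
def filter_centroids(centroids, surviving_ids):
    """Keep only the centroids whose id belongs to a surviving canopy cluster."""
    return {cid: vec for cid, vec in centroids.items() if cid in surviving_ids}


# Hypothetical data: x + n = 5 canopy centroids, keyed by a made-up id.
centroids = {
    "C-0": [0.1, 0.2],
    "C-1": [5.0, 5.1],
    "C-2": [9.8, 0.3],
    "C-3": [0.2, 9.9],
    "C-4": [4.9, 0.1],
}

# Hypothetical ids of the x = 3 canopy clusters that passed the filter.
surviving_ids = {"C-0", "C-2", "C-3"}

# The reduced set that would be handed to k-means as seeds.
seeds = filter_centroids(centroids, surviving_ids)
```

The number of seeds then equals the number of surviving clusters, as long as every surviving cluster id also appears among the centroids.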


What about running a single k-means assignment pass where you assign the
x+n canopy centroids to each of the x clusters that you have?
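
A minimal sketch of such an assignment pass, in plain Python rather than Mahout, might look like the following. It assumes Euclidean distance and small in-memory vectors; the function names and the toy data are invented for illustration. The final averaging step is one possible way to turn the assignment into x seeds and is an assumption, not something stated above.

```python
import math


def nearest(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def assignment_pass(centroids, cluster_centers):
    """Assign every canopy centroid to its nearest cluster center.

    Returns a list of lists: groups[i] holds the centroids assigned to cluster i.
    """
    groups = [[] for _ in cluster_centers]
    for c in centroids:
        groups[nearest(c, cluster_centers)].append(c)
    return groups


def group_means(groups, fallback):
    """Average each group component-wise; keep the fallback center if a group is empty."""
    means = []
    for group, center in zip(groups, fallback):
        if not group:
            means.append(list(center))
        else:
            means.append([sum(col) / len(group) for col in zip(*group)])
    return means


# Hypothetical toy data: x = 2 cluster centers, x + n = 4 canopy centroids.
cluster_centers = [[0.0, 0.0], [10.0, 10.0]]
canopy_centroids = [[0.0, 1.0], [1.0, 0.0], [9.0, 10.0], [10.0, 9.0]]

groups = assignment_pass(canopy_centroids, cluster_centers)
seeds = group_means(groups, cluster_centers)  # assumption: average each group into one seed
```

Whatever the input, the result has exactly x seeds, one per cluster, which is the k that was wanted.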
