On Fri, 28 Aug 2009 11:15:22 -0700 Ted Dunning <[email protected]> wrote:
> To cluster strings, you need to have a distance between "centroids"
> and strings. The DP clustering stuff could handle this, but not the
> rest of the clustering.

As an aside: It is possible to formulate k-means (probably canopy as
well?) on centroids. In the current implementation, at least the reduce
step would have to be modified.

Isabel
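
To make the point about the reduce step concrete, here is a minimal
single-machine sketch in Python. It is not Mahout code. It assumes edit
distance as the string distance and assumes that the "centroid" of a
cluster is the member string with the smallest total distance to the
other members (a medoid), since strings cannot be averaged. All function
names (edit_distance, assign, update, cluster_strings) are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def assign(strings, centroids):
    """'Map'-like step: attach each string to its nearest centroid."""
    clusters = {i: [] for i in range(len(centroids))}
    for s in strings:
        nearest = min(range(len(centroids)),
                      key=lambda i: edit_distance(s, centroids[i]))
        clusters[nearest].append(s)
    return clusters


def update(members, old_centroid):
    """'Reduce'-like step: this is the part that differs from numeric
    k-means. Instead of averaging vectors, pick the member with the
    smallest summed distance to all members (assumption: medoid)."""
    if not members:
        return old_centroid  # keep the old centroid for an empty cluster
    return min(members,
               key=lambda c: sum(edit_distance(c, m) for m in members))


def cluster_strings(strings, initial_centroids, max_iter=10):
    """Alternate assign/update until the centroids stop changing."""
    centroids = list(initial_centroids)
    for _ in range(max_iter):
        clusters = assign(strings, centroids)
        new_centroids = [update(clusters[i], centroids[i])
                         for i in range(len(centroids))]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Recompute so the returned clusters match the returned centroids.
    return centroids, assign(strings, centroids)


if __name__ == "__main__":
    words = ["kitten", "sitten", "mitten", "apple", "apply", "ample"]
    centroids, clusters = cluster_strings(words, ["kitten", "apple"])
    print(centroids)
    print(clusters)
```

Only update() departs from the numeric case; assign() needs nothing
beyond a distance function. The medoid update costs a quadratic number
of distance computations per cluster, which is one reason a distributed
version would need more thought than this sketch suggests.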
