Hi all, has anyone tried Dunning's large-scale k-means (https://github.com/tdunning/knn)? It looks pretty interesting.
It does not seem to have a working MapReduce version yet, although the documentation states that the implementation is straightforward. If anyone has tried that implementation, could you please share some performance numbers (e.g. size of data, running time, cluster quality)? I am curious how well this clustering algorithm does, since it is only an approximation of traditional k-means. Are there any known error bounds?

--
Regards,
Jiaan
