Hello,
I know that Mahout is used for batch processing, but I would like to
know whether, and how, I can use its KMeans to cluster individual points.
Let's say that we have the following situation:
* Global clustering, which performs batch processing on all the data and
gives the centroids as its result
* Single-point clustering, which uses the centroids from the global
clustering to assign that point to a cluster - it does not require
cluster centroid re-computation, just assigning the point to an
existing cluster
Can I do this using Mahout, or do I have to implement it myself? I
thought of setting the number of iterations to 1 and assigning the point
that way, but the thing is, KMeans recomputes the cluster centroids, and
if the new point is an outlier, it makes a new cluster from it. I don't
want that; I actually want the distance to the closest centroid.
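To make it concrete, this is roughly the assignment step I have in mind,
as a plain Java sketch that does not use any Mahout classes. The class
and method names are made up for illustration, and I am assuming
Euclidean distance and that the centroids from the global clustering are
available as double arrays:

```java
public final class NearestCentroid {

    /** Index of the closest centroid and the distance to it. */
    public static final class Assignment {
        public final int cluster;
        public final double distance;

        Assignment(int cluster, double distance) {
            this.cluster = cluster;
            this.distance = distance;
        }
    }

    /** Euclidean distance between two vectors of equal length (assumed metric). */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Assigns a single point to the nearest of the given centroids.
     * The centroids are only read, never recomputed, and no new
     * cluster is created, even if the point is an outlier.
     */
    public static Assignment assign(double[] point, double[][] centroids) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int k = 0; k < centroids.length; k++) {
            double d = euclidean(point, centroids[k]);
            if (d < bestDistance) {
                bestDistance = d;
                best = k;
            }
        }
        return new Assignment(best, bestDistance);
    }
}
```

An outlier would still be assigned to an existing cluster here, just
with a large distance, which I could then compare against a threshold.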
For now, it seems that KMeans itself is not really appropriate for
this, and that the assignment step should be implemented separately...
Is that correct?
Thanks