Hi Yuval,
This is certainly neither planned nor desirable behavior in 0.7. Running
"./bin/mahout org.apache.mahout.clustering.display.DisplayKMeans"
does, however, plot the k-means clusters converging as designed. Each
iteration's clusters are written to a new clusters-i directory, so I am
surprised by your observations. Can you be a bit more explicit about the
nature of the data you are clustering, the command line invocation and
how you have determined that the clusters are not converging?
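For reference, one rough way to quantify "not converging" is to measure how far each centroid moves between two consecutive iterations. Below is a minimal Python sketch, not Mahout code: the function names are hypothetical, the centroids are assumed to have already been extracted as plain lists of numbers (how to read them out of the clusters-i directories is not shown), and delta stands in for a convergence threshold under Euclidean distance, which may differ from the distance measure used in your job.

```python
import math


def centroid_shifts(previous, current):
    """Return the Euclidean distance each centroid moved between two iterations.

    `previous` and `current` are lists of centroids in the same order,
    each centroid being a list of floats of equal length.
    """
    if len(previous) != len(current):
        raise ValueError("both iterations must have the same number of centroids")
    return [math.dist(p, c) for p, c in zip(previous, current)]


def has_converged(previous, current, delta):
    """True when no centroid moved farther than `delta` (an assumed threshold)."""
    return all(shift <= delta for shift in centroid_shifts(previous, current))


# Illustrative data only: the first centroid moves, the second stays put.
iteration_0 = [[0.0, 0.0], [1.0, 1.0]]
iteration_1 = [[3.0, 4.0], [1.0, 1.0]]

shifts = centroid_shifts(iteration_0, iteration_1)
print(shifts)
print(has_converged(iteration_0, iteration_1, delta=0.5))
```

Shifts that shrink toward the threshold over successive iterations indicate normal convergence, whereas shifts that are exactly zero from the very first iteration would match the "clusters never change" symptom.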
Please feel free to open a JIRA and we can work through this.
Is anybody else in the user community experiencing anything similar to
this posting?
Jeff
On 7/26/12 10:42 AM, Yuval Feinstein wrote:
Hi.
I am trying to run clustering using Mahout 0.7.
I am clustering short text documents.
The general framework is first running Canopy clustering,
and later running kmeans using the Canopy centroids as a starting point.
This gave useful results in Mahout 0.6.
The canopy part works fine,
but kmeans in Mahout 0.7 seems to keep the same clusters and does not modify
them between iterations:
I get the same file names with the same sizes, and the process does not
seem to converge.
It looks to me like a bug, but it might be the planned behavior.
TIA,
Yuval