On Jul 27, 2009, at 12:55 PM, Shashikant Kore wrote:

On Mon, Jul 27, 2009 at 10:11 PM, Grant Ingersoll<[email protected]> wrote:

Not following. The distance calc stuff is irrespective of the type of Vector. I was referring to the centroid length square (I think you called it the triangle inequality) stuff that Shashikant added on MAHOUT-121. We use it for testing convergence, but not for other distance calculations. I haven't looked to see if it is applicable yet, but it seems like it should be.
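
For readers following along, here is a minimal sketch of how a cached centroid length square could be reused in a distance calculation, via the expansion ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2. This is only an illustration, not the code from MAHOUT-121: the class and method names are hypothetical, and the sparse point is assumed to be a plain Map from index to value rather than a Mahout Vector.

```java
import java.util.Map;

// Hypothetical helper, not part of Mahout: illustrates reusing a cached
// centroid length square when computing squared Euclidean distance.
class CentroidDistanceSketch {

    // Squared L2 norm of a dense centroid. Intended to be computed once
    // per centroid per iteration and cached by the caller.
    static double lengthSquared(double[] centroid) {
        double sum = 0.0;
        for (double v : centroid) {
            sum += v * v;
        }
        return sum;
    }

    // ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2.
    // Only the non-zero entries of the sparse point are visited, so the
    // cost depends on the point's non-zeros, not the centroid's dimension.
    static double squaredDistance(Map<Integer, Double> point,
                                  double pointLengthSquared,
                                  double[] centroid,
                                  double centroidLengthSquared) {
        double dot = 0.0;
        for (Map.Entry<Integer, Double> e : point.entrySet()) {
            dot += e.getValue() * centroid[e.getKey()];
        }
        return pointLengthSquared - 2.0 * dot + centroidLengthSquared;
    }
}
```

The point of the design is that the centroid term is paid once per centroid rather than once per (point, centroid) pair, which matters when the centroids are dense and the documents are sparse.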


Grant,

Yes, that part of the patch is missing.  In my original patch, I had
modified emitPointToNearestCluster() in kmeans/Cluster.java to
calculate the distance between the document and the centroids of the
various clusters.  (There is no triangle inequality code, though.)  I
don't see that code in the later patches.

I had reviewed the final patch, but I missed this one.  I think I
only ran Canopy and not K-means.  Incidentally, I am hopelessly out
of date with trunk, as I have not worked on this recently.  BTW, I
haven't really followed this thread in depth, so I might be speaking
out of context here.  Apologies.

I'll be on a plane tomorrow, will see if I can track down the differences.

-Grant
