On Jul 27, 2009, at 12:03 PM, Ted Dunning wrote:

Yes.

That explains why Jeff didn't see the slowdown with dense vectors.

Not following. The distance calculation is independent of the type of Vector. I was referring to the centroid length-squared optimization (I think you called it the triangle inequality) that Shashikant added in MAHOUT-121. We use it for testing convergence, but not for the other distance calculations. I haven't checked yet whether it is applicable there, but it seems like it should be.
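
For reference, here is a minimal sketch of how a cached squared centroid length could be reused when assigning a point to its nearest cluster. It assumes the optimization amounts to the expansion |x - c|^2 = |x|^2 - 2(x . c) + |c|^2, which is my reading of the thread rather than something confirmed against the MAHOUT-121 patch. The class and method names below are hypothetical and are not Mahout's API. A sparse point is modeled as a plain index-to-value map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Mahout code: squared Euclidean distance using
// cached squared lengths, so each point/centroid pair costs one sparse dot product.
class SquaredDistanceSketch {

    // Dot product that only visits the non-zero entries of the sparse point.
    static double dot(Map<Integer, Double> sparsePoint, double[] centroid) {
        double sum = 0.0;
        for (Map.Entry<Integer, Double> e : sparsePoint.entrySet()) {
            sum += e.getValue() * centroid[e.getKey()];
        }
        return sum;
    }

    // Squared length of a dense centroid; computed once per iteration and cached.
    static double lengthSquared(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x * x;
        }
        return sum;
    }

    // Squared length of a sparse point; computed once per point.
    static double lengthSquared(Map<Integer, Double> v) {
        double sum = 0.0;
        for (double x : v.values()) {
            sum += x * x;
        }
        return sum;
    }

    // |x - c|^2 = |x|^2 - 2 (x . c) + |c|^2
    static double squaredDistance(Map<Integer, Double> point, double pointLengthSquared,
                                  double[] centroid, double centroidLengthSquared) {
        return pointLengthSquared - 2.0 * dot(point, centroid) + centroidLengthSquared;
    }

    // Index of the nearest centroid, given the cached squared centroid lengths.
    static int nearest(Map<Integer, Double> point, double[][] centroids,
                       double[] centroidLengthsSquared) {
        double pointLengthSquared = lengthSquared(point);
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centroids.length; i++) {
            double d = squaredDistance(point, pointLengthSquared,
                                       centroids[i], centroidLengthsSquared[i]);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return best;
    }
}
```

Since the squared distance preserves ordering, the nearest cluster can be chosen without taking a square root. Floating-point cancellation can make the expanded form slightly negative for near-identical vectors, so a real implementation would probably clamp at zero.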


On Mon, Jul 27, 2009 at 8:03 AM, Grant Ingersoll <[email protected]> wrote:

Hmm, some profiling shows the pain is in the distance calculation for
emitPointToNearestCluster. It seems we only use the optimized distance
calculations for testing convergence, but shouldn't we also use them for
calculating the distances to the clusters?




--
Ted Dunning, CTO
DeepDyve

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
