Jake,

The distance optimization was done in MAHOUT-121.
http://issues.apache.org/jira/browse/MAHOUT-121

The idea is described neatly on the LingPipe blog:
http://lingpipe-blog.com/2009/03/12/speeding-up-k-means-clustering-algebra-sparse-vectors/
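
Roughly, the trick rests on expanding the squared Euclidean distance as
|c - v|^2 = |c|^2 - 2 c.v + |v|^2. If |c|^2 is computed once per centroid,
the remaining terms only need the non-zero entries of a sparse v. Below is
a minimal sketch of that algebra, not the actual MAHOUT-121 patch: the class
name, the use of double[] for the centroid, and Map<Integer, Double> for the
sparse vector are all assumptions made for illustration, and it works with
squared distances throughout.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: hypothetical names, not Mahout's DistanceMeasure API.
class SparseDistanceSketch {

    // Squared length of a dense vector; meant to be computed once per centroid.
    static double lengthSquared(double[] dense) {
        double sum = 0.0;
        for (double x : dense) {
            sum += x * x;
        }
        return sum;
    }

    // Plain squared Euclidean distance: visits every dimension of the centroid.
    static double distanceSquared(double[] centroid, Map<Integer, Double> v) {
        double sum = 0.0;
        for (int i = 0; i < centroid.length; i++) {
            double diff = centroid[i] - v.getOrDefault(i, 0.0);
            sum += diff * diff;
        }
        return sum;
    }

    // Optimized form: |c|^2 - 2 c.v + |v|^2, visiting only the non-zeros of v.
    // Assumes every index in v is a valid index into centroid.
    static double distanceSquared(double centroidLengthSquare,
                                  double[] centroid,
                                  Map<Integer, Double> v) {
        double dot = 0.0;
        double vLengthSquare = 0.0;
        for (Map.Entry<Integer, Double> e : v.entrySet()) {
            double x = e.getValue();
            dot += centroid[e.getKey()] * x;
            vLengthSquare += x * x;
        }
        return centroidLengthSquare - 2.0 * dot + vLengthSquare;
    }

    public static void main(String[] args) {
        double[] centroid = {1.0, 0.0, 2.0, 0.0, 3.0};
        Map<Integer, Double> v = new TreeMap<>();
        v.put(0, 1.0);
        v.put(4, 1.0);

        double plain = distanceSquared(centroid, v);
        double fast = distanceSquared(lengthSquared(centroid), centroid, v);
        System.out.println(plain + " " + fast);
    }
}
```

The two overloads should agree up to floating-point rounding, and the
saving comes from reusing the centroid's squared length across all the
points compared against it.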

I will go through the conversation between you and Ted, and chip in
wherever needed.

--shashi

On Wed, Jan 27, 2010 at 11:42 PM, Jake Mannix <jake.man...@gmail.com> wrote:
> The interface defines two methods:
>
>
>  double distance(Vector v1, Vector v2);
>  double distance(double centroidLengthSquare, Vector centroid, Vector v);
>
>
> The latter is an optimized form of the former, and satisfies:
>
>  distance(v1, v2) == distance(v1.getLengthSquared(), v1, v2)
>
> Is this correct?  Every place I see this method called, it is used in this
> fashion, at least...
>
>  -jake
>
