[
https://issues.apache.org/jira/browse/MAHOUT-645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013983#comment-13013983
]
Gustavo Salazar Torres commented on MAHOUT-645:
-----------------------------------------------
Sean, I created the patch (I hope I did it right). As for whether to commit it
as an optimization, Ted and Robert suggested waiting until we have further
results. The intention is to implement the full Elkan optimization, which will
change almost all of the current K-means implementation.
> Elkan distance optimization for VectorBenchmarks class
> ------------------------------------------------------
>
> Key: MAHOUT-645
> URL: https://issues.apache.org/jira/browse/MAHOUT-645
> Project: Mahout
> Issue Type: Improvement
> Components: Clustering
> Affects Versions: 0.4
> Environment: Ubuntu Linux at Intel Core2 Duo P7450 @ 2.13GHz
> Reporter: Gustavo Salazar Torres
> Priority: Minor
> Labels: centroid, clustering, elkan
> Fix For: 0.4
>
> Attachments: patches.zip
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Implementation of the first lemma of Elkan's optimization:
> Given three points x, b, c (where b and c are centroids):
> if d(b,c) >= 2d(x,b) then d(x,c) >= d(x,b),
> in which case we don't need to calculate d(x,c). This is used to find the
> closest centroid for every point x.
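
The lemma quoted above follows from the triangle inequality: d(b,c) <= d(x,b) + d(x,c), so d(x,c) >= d(b,c) - d(x,b) >= d(x,b) whenever d(b,c) >= 2d(x,b). It therefore holds only for a true metric such as Euclidean distance. A minimal sketch of how the check could be used when searching for the closest centroid is given here. It is not the code in patches.zip: the class ElkanLemma, its methods and the skipped counter are hypothetical names chosen for illustration, and plain double[] arrays stand in for Mahout vectors.

```java
// Illustrative sketch only; all names here are hypothetical, not Mahout API.
class ElkanLemma {

    /** Number of point-to-centroid distances the lemma skipped in the last search. */
    static int skipped = 0;

    /** Euclidean distance; the lemma needs a metric obeying the triangle inequality. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Precomputes all centroid-to-centroid distances, once per iteration. */
    static double[][] centroidDistances(double[][] centroids) {
        int k = centroids.length;
        double[][] cc = new double[k][k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                cc[i][j] = distance(centroids[i], centroids[j]);
                cc[j][i] = cc[i][j];
            }
        }
        return cc;
    }

    /** Returns the index of the centroid closest to x, skipping candidates the lemma rules out. */
    static int closestCentroid(double[] x, double[][] centroids, double[][] cc) {
        skipped = 0;
        int best = 0;
        double bestDist = distance(x, centroids[0]);
        for (int c = 1; c < centroids.length; c++) {
            // Lemma: d(b,c) >= 2 d(x,b) implies d(x,c) >= d(x,b), so c cannot beat b.
            if (cc[best][c] >= 2.0 * bestDist) {
                skipped++;
                continue;
            }
            double d = distance(x, centroids[c]);
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centroids = {{0.0, 0.0}, {10.0, 0.0}, {0.0, 10.0}};
        double[][] cc = centroidDistances(centroids);
        int best = closestCentroid(new double[] {1.0, 1.0}, centroids, cc);
        System.out.println("closest=" + best + " skipped=" + skipped);
    }
}
```

The centroid-to-centroid table costs k(k-1)/2 distance computations, so the check should pay off only when that cost is amortized over many points per iteration. Whether it yields a net gain on real data is the question the further results mentioned in the comment are meant to answer.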
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira