[
https://issues.apache.org/jira/browse/MATH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gilles Sadowski resolved MATH-1371.
-----------------------------------
Fix Version/s: 4.0
(was: 4.X)
Resolution: Implemented
Code added (with many necessary changes) in commit
74a851b611bf6db1c6177217f1a88b71352e3faf (in "master" branch).
> Provide accelerated kmeans++ implementation
> -------------------------------------------
>
> Key: MATH-1371
> URL: https://issues.apache.org/jira/browse/MATH-1371
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Artem Barger
> Assignee: Artem Barger
> Priority: Major
> Fix For: 4.0
>
> Attachments: ElkanKmeansPlusPlusClusterer.java,
> ElkanKmeansPlusPlusClustererTest.java
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> An accelerated version of the kmeans++ algorithm is available, published in:
> Elkan, Charles. "Using the triangle inequality to accelerate k-means."
> ICML. Vol. 3. 2003.
> The main idea is to speed up the k-means iterations by skipping distance
> computations between centers and points when they are not needed. For
> example, if after an update a cluster center has not moved far relative to a
> point, that point's assignment does not change. The accelerated algorithm avoids
> unnecessary distance calculations by applying the triangle inequality in two
> different ways, and by keeping track of lower and upper bounds for distances
> between points and centers.
> The algorithm is described in full in the paper.
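
To illustrate the pruning idea described above, here is a minimal sketch of
one of the two triangle-inequality tests: if the distance between the current
best center b and another center c is at least twice the distance from point x
to b, then c cannot be closer to x than b, so d(x, c) need not be computed.
This is not the code from the attachments or from the commit. The class name
TrianglePruningSketch and its methods are hypothetical, and the sketch omits
the per-point lower and upper bounds that the full algorithm maintains across
iterations.

```java
/** Illustrative sketch only; NOT the implementation committed for MATH-1371. */
final class TrianglePruningSketch {
    /** Number of point-to-center distances skipped by the last call to assign(). */
    static int skipped;

    /** Euclidean distance between two vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Assigns each point to its nearest center, pruning with the triangle inequality. */
    static int[] assign(double[][] points, double[][] centers) {
        int k = centers.length;

        // Center-to-center distances, computed once per assignment pass.
        double[][] cc = new double[k][k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                cc[i][j] = distance(centers[i], centers[j]);
                cc[j][i] = cc[i][j];
            }
        }

        skipped = 0;
        int[] result = new int[points.length];
        for (int p = 0; p < points.length; p++) {
            int best = 0;
            double bestDist = distance(points[p], centers[0]);
            for (int c = 1; c < k; c++) {
                // d(best, c) >= 2 * d(x, best) implies d(x, c) >= d(x, best),
                // so center c cannot win and its distance is not computed.
                if (cc[best][c] >= 2 * bestDist) {
                    skipped++;
                    continue;
                }
                double d = distance(points[p], centers[c]);
                if (d < bestDist) {
                    bestDist = d;
                    best = c;
                }
            }
            result[p] = best;
        }
        return result;
    }
}
```

Calling TrianglePruningSketch.assign(points, centers) returns the index of the
nearest center for each point. The skipped counter is there only to make the
saving visible; a real implementation would not need it.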