[ 
https://issues.apache.org/jira/browse/MATH-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058004#comment-17058004
 ] 

Chen Tao commented on MATH-1524:
--------------------------------

{quote}I don't get that; If the method is "private", users (like a GUI 
application) won't be able to call it..{quote}
I mean that I only used getCenter() to predict which cluster is the best fit 
for a new point. "KMeansPlusPlusClusterer#getNearestCluster" would now serve 
that purpose, but it is private (otherwise I would not need getCenter()).
And, as you said, a GUI application may use it; storing and loading clusters 
may also need getCenter().
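
A minimal sketch of that use case, assuming the cluster centers are available 
as plain double[] arrays (for example, values previously read through 
getCenter()). The class NearestCenter and its methods are hypothetical names 
for illustration only; they are not part of the Commons Math API, and this is 
not the body of the private getNearestCluster method.

```java
// Hypothetical helper, not part of Commons Math:
// picks the stored center that is closest to a new point.
final class NearestCenter {
    private NearestCenter() {}

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the index of the closest center, or -1 if there are no centers. */
    static int nearest(double[][] centers, double[] point) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centers.length; i++) {
            double d = squaredDistance(centers[i], point);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }
}
```

With only the centers kept (in memory, in a file, or in a database row), 
NearestCenter.nearest(centers, newPoint) is enough to assign a new point; the 
original member points of each cluster are not needed.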

Cluster.centroid() cannot be called frequently on a big dataset, and it cannot 
be transported or used in a database, whereas getCenter() can.
So Cluster.centroid() is useful in the clustering and evaluator code, but 
rarely useful for an ML user application (like a GUI application); it could be 
package-private (the original method, centroidOf, is private).

> "chooseInitialCenters" should move out from KMeansPlusPlusClusterer
> -------------------------------------------------------------------
>
>                 Key: MATH-1524
>                 URL: https://issues.apache.org/jira/browse/MATH-1524
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Chen Tao
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are two reasons why "chooseInitialCenters" should be moved out of 
> "KMeansPlusPlusClusterer":
> # The k-means++ clusterer is a special case of the k-means clusterer, one in 
> which the cluster centers are initialized with the k-means++ algorithm. 
> Another case initializes the cluster centers with random points.
> # Mini-batch k-means will reuse "chooseInitialCenters". 
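
The split proposed in the description above could look roughly like the 
following sketch. The interface CentersInitializer and the two implementing 
classes are hypothetical names chosen for illustration; they are not the 
Commons Math API, and points are simplified to double[] arrays. The k-means++ 
variant draws each new center with probability proportional to its squared 
distance from the nearest center chosen so far.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical interface, not Commons Math API: one seeding strategy per implementation.
interface CentersInitializer {
    List<double[]> chooseInitialCenters(List<double[]> points, int k, Random rng);
}

// Picks up to k points at distinct indices, uniformly at random.
final class RandomInitializer implements CentersInitializer {
    @Override
    public List<double[]> chooseInitialCenters(List<double[]> points, int k, Random rng) {
        List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, rng);
        int n = Math.max(0, Math.min(k, shuffled.size()));
        return new ArrayList<>(shuffled.subList(0, n));
    }
}

// k-means++ seeding: first center uniform, later ones weighted by squared distance.
final class KMeansPlusPlusInitializer implements CentersInitializer {
    @Override
    public List<double[]> chooseInitialCenters(List<double[]> points, int k, Random rng) {
        List<double[]> centers = new ArrayList<>();
        if (points.isEmpty() || k <= 0) {
            return centers;
        }
        centers.add(points.get(rng.nextInt(points.size())));
        // minDist[i] = squared distance from point i to its nearest chosen center
        double[] minDist = new double[points.size()];
        Arrays.fill(minDist, Double.POSITIVE_INFINITY);
        while (centers.size() < Math.min(k, points.size())) {
            double[] last = centers.get(centers.size() - 1);
            double total = 0;
            for (int i = 0; i < points.size(); i++) {
                minDist[i] = Math.min(minDist[i], squaredDistance(points.get(i), last));
                total += minDist[i];
            }
            if (total == 0) {
                break; // every point coincides with a chosen center; return fewer than k
            }
            double r = rng.nextDouble() * total;
            int chosen = -1;
            for (int i = 0; i < points.size(); i++) {
                if (minDist[i] > 0) {
                    chosen = i; // last positive-weight index is the rounding fallback
                    r -= minDist[i];
                    if (r < 0) {
                        break;
                    }
                }
            }
            centers.add(points.get(chosen));
        }
        return centers;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}
```

Under this arrangement a plain k-means clusterer, a k-means++ clusterer, and a 
mini-batch k-means clusterer could each accept a CentersInitializer instead of 
owning a private copy of the seeding logic.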



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
