[ 
https://issues.apache.org/jira/browse/MATH-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058422#comment-17058422
 ] 

Chen Tao edited comment on MATH-1524 at 3/13/20, 5:02 AM:
----------------------------------------------------------

A use case for "closest":
{code:java}
    // I am not sure where closestIndexes should live.
    // Assumed here: it takes the query point and returns indexes into "list",
    // ordered from closest to farthest.
    <T extends Clusterable> int[] closestIndexes(Clusterable point, List<T> list, DistanceMeasure dist);

    // ...
    // ++++++ The training phase ++++++
    // Use k-means to generate the clusters.
    List<? extends CentroidCluster<? extends Clusterable>> clusters = ...
    // Extract the centers; it is a little strange to make the user do this.
    List<? extends Clusterable> centers =
        clusters.stream().map(cluster -> cluster.centroid()).collect(Collectors.toList());
    // ------ End of the training phase ------

    // ++++++ The predict phase ++++++
    Clusterable pointToPredict = ...
    int[] bestClusterIndexes = aUtilClass.closestIndexes(pointToPredict, centers, measure);
    CentroidCluster<? extends Clusterable> bestCluster = clusters.get(bestClusterIndexes[0]);
    int[] closestPointsIndexes =
        aUtilClass.closestIndexes(pointToPredict, bestCluster.getPoints(), measure);
    List<? extends Clusterable> closestPoints = Arrays.stream(closestPointsIndexes)
        .mapToObj(i -> bestCluster.getPoints().get(i))
        .collect(Collectors.toList());
    // Other logic, such as returning more information about closestPoints
    // ...
    // ------ End of the predict phase ------
{code}
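
For illustration only, here is a self-contained sketch of what such a helper could look like. It does not use the Commons Math types: points are plain double[] arrays, and the class name ClosestSketch, the Distance interface and the explicit "point" parameter are all assumptions made for this example, not an existing API.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

public class ClosestSketch {
    /** Hypothetical stand-in for a distance measure; not the Commons Math interface. */
    public interface Distance {
        double compute(double[] a, double[] b);
    }

    /** Returns the indexes of "candidates", ordered from nearest to farthest from "point". */
    public static int[] closestIndexes(double[] point, List<double[]> candidates, Distance dist) {
        return IntStream.range(0, candidates.size())
                .boxed()
                .sorted(Comparator.comparingDouble(
                        (Integer i) -> dist.compute(point, candidates.get(i))))
                .mapToInt(Integer::intValue)
                .toArray();
    }

    /** Plain Euclidean distance, used here only to have something to sort by. */
    public static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        List<double[]> centers = List.of(
                new double[] {0, 0}, new double[] {10, 10}, new double[] {4, 4});
        int[] order = closestIndexes(new double[] {3, 3}, centers, ClosestSketch::euclidean);
        System.out.println(Arrays.toString(order)); // prints [2, 0, 1]
    }
}
```

The same call can then be applied twice, as in the use case above: once against the centers to pick the best cluster, and once against that cluster's points.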



was (Author: chentao106):
A use case for "closest":
{code:java}
    // I am not sure where closestIndexes should live.
    // Assumed here: it takes the query point and returns indexes into "list",
    // ordered from closest to farthest.
    <T extends Clusterable> int[] closestIndexes(Clusterable point, List<T> list, DistanceMeasure dist);

    // ...
    List<? extends CentroidCluster<? extends Clusterable>> clusters = ...
    List<? extends Clusterable> centers =
        clusters.stream().map(cluster -> cluster.centroid()).collect(Collectors.toList());

    Clusterable pointToPredict = ...
    int[] bestClusterIndexes = aUtilClass.closestIndexes(pointToPredict, centers, measure);
    CentroidCluster<? extends Clusterable> bestCluster = clusters.get(bestClusterIndexes[0]);
    int[] closestPointsIndexes =
        aUtilClass.closestIndexes(pointToPredict, bestCluster.getPoints(), measure);
    List<? extends Clusterable> closestPoints = Arrays.stream(closestPointsIndexes)
        .mapToObj(i -> bestCluster.getPoints().get(i))
        .collect(Collectors.toList());
    // Other logic, such as returning more information about closestPoints
    // ...
{code}


> "chooseInitialCenters" should move out from KMeansPlusPlusClusterer
> -------------------------------------------------------------------
>
>                 Key: MATH-1524
>                 URL: https://issues.apache.org/jira/browse/MATH-1524
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Chen Tao
>            Priority: Major
>         Attachments: centroid.png, getCenter.png
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are two reasons why "chooseInitialCenters" should be moved out of 
> "KMeansPlusPlusClusterer":
> # The k-means++ clusterer is a special case of the k-means clusterer: it 
> initializes the cluster centers with the k-means++ algorithm. Another case 
> initializes the cluster centers with random points.
> # The mini-batch k-means clusterer will reuse "chooseInitialCenters".
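
One way to picture the two reasons above is a small strategy interface that any k-means variant could accept. The following is a minimal, self-contained sketch under stated assumptions: the names InitializerSketch, CenterInitializer, RANDOM and KMEANS_PLUS_PLUS are hypothetical, points are plain double[] arrays, and none of this is the Commons Math API or the change proposed in the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class InitializerSketch {
    /** Hypothetical strategy interface: picks k initial centers from the data (assumes k <= points.size()). */
    public interface CenterInitializer {
        List<double[]> chooseInitialCenters(List<double[]> points, int k, Random rng);
    }

    /** Random initialization: k distinct points chosen uniformly at random. */
    public static final CenterInitializer RANDOM = (points, k, rng) -> {
        List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, rng);
        return new ArrayList<>(shuffled.subList(0, k));
    };

    /** k-means++ seeding: each new center is drawn with probability proportional
     *  to its squared distance from the nearest center chosen so far. */
    public static final CenterInitializer KMEANS_PLUS_PLUS = (points, k, rng) -> {
        List<double[]> centers = new ArrayList<>();
        centers.add(points.get(rng.nextInt(points.size())));
        double[] minSq = new double[points.size()];
        Arrays.fill(minSq, Double.POSITIVE_INFINITY);
        while (centers.size() < k) {
            // Update each point's squared distance to its nearest chosen center.
            double[] last = centers.get(centers.size() - 1);
            double total = 0;
            for (int i = 0; i < points.size(); i++) {
                minSq[i] = Math.min(minSq[i], squaredDistance(points.get(i), last));
                total += minSq[i];
            }
            // Weighted draw; points with zero weight are never selected here.
            double r = rng.nextDouble() * total;
            double acc = 0;
            int chosen = -1;
            for (int i = 0; i < points.size(); i++) {
                acc += minSq[i];
                if (minSq[i] > 0) {
                    chosen = i;
                    if (acc > r) {
                        break;
                    }
                }
            }
            if (chosen < 0) {
                // All remaining points coincide with a center; fall back to any point.
                chosen = rng.nextInt(points.size());
            }
            centers.add(points.get(chosen));
        }
        return centers;
    };

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<double[]> points = List.of(
                new double[] {0, 0}, new double[] {0, 1},
                new double[] {10, 10}, new double[] {10, 11});
        // Either strategy can be handed to the same clustering loop.
        for (CenterInitializer init : List.of(RANDOM, KMEANS_PLUS_PLUS)) {
            List<double[]> centers = init.chooseInitialCenters(points, 2, new Random(42));
            centers.forEach(c -> System.out.println(Arrays.toString(c)));
        }
    }
}
```

With the initialization behind an interface like this, a plain k-means clusterer, a k-means++ clusterer and a mini-batch variant would differ only in which initializer they are given.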



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
