[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

ASF GitHub Bot (JIRA) Tue, 30 Jun 2015 01:52:57 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14607968#comment-14607968
 ]


ASF GitHub Bot commented on FLINK-1731:
---------------------------------------

Github user sachingoel0101 commented on the pull request:

    https://github.com/apache/flink/pull/700#issuecomment-117060737
  
    Hi. IMO, the purpose of learning is to develop a model which compactly 
represents the data. Thus, having a distributed model doesn't make 
sense. Besides, the user might just want to take the model and use it somewhere 
else, in which case it makes sense to have it available in non-distributed 
form, as a plain Java/Scala object which the user can easily operate on.
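
As a rough illustration of the point above, a non-distributed model could be as small as the following sketch. The class name KMeansModel and its predict method are hypothetical and are not taken from flink-ml or from the pull request; the sketch only assumes that a trained kMeans model is a set of centroids small enough to fit in memory.

```java
import java.io.Serializable;

// Hypothetical local k-means model: only the centroids, held in memory.
// The name KMeansModel and its methods are assumptions made for
// illustration; they are not the flink-ml API.
final class KMeansModel implements Serializable {
    private static final long serialVersionUID = 1L;

    private final double[][] centroids;

    KMeansModel(double[][] centroids) {
        if (centroids.length == 0) {
            throw new IllegalArgumentException("at least one centroid is required");
        }
        // Defensive copy so later changes by the caller do not alter the model.
        this.centroids = new double[centroids.length][];
        for (int i = 0; i < centroids.length; i++) {
            this.centroids[i] = centroids[i].clone();
        }
    }

    // Returns the index of the centroid closest to the given point.
    int predict(double[] point) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centroids.length; i++) {
            double d = squaredDistance(point, centroids[i]);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }

    // Squared Euclidean distance; both arrays must have the same length.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}
```

Because such an object is serializable and has no dependency on the execution environment, it could be written to disk or used in another program without any cluster being involved.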


> Add kMeans clustering algorithm to machine learning library
> -----------------------------------------------------------
>
>                 Key: FLINK-1731
>                 URL: https://issues.apache.org/jira/browse/FLINK-1731
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Peter Schrott
>              Labels: ML
>
> The Flink repository already contains a kMeans implementation but it is not 
> yet ported to the machine learning library. I assume that only the used data 
> types have to be adapted and then it can be more or less directly moved to 
> flink-ml.
> The kMeans++ [1] and the kMeans|| [2] algorithms constitute a better 
> implementation because they improve the initial seeding phase to achieve 
> near-optimal clustering. It might be worthwhile to implement kMeans||.
> Resources:
> [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
> [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf
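
The seeding phase mentioned in the description can be sketched as follows for kMeans++ [1], in a single-machine form: the first center is drawn uniformly at random, and each further center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The class and method names are hypothetical, and this is neither the distributed kMeans|| variant [2] nor code from the Flink repository.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical single-machine sketch of kMeans++ seeding.
// Names are illustrative only and do not come from Flink.
final class KMeansPlusPlusSeeding {

    static double[][] seed(double[][] points, int k, Random rnd) {
        int n = points.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        double[][] centers = new double[k][];
        // First center: uniformly at random.
        centers[0] = points[rnd.nextInt(n)].clone();

        // minDist[i] = squared distance from point i to its nearest chosen center.
        double[] minDist = new double[n];
        Arrays.fill(minDist, Double.POSITIVE_INFINITY);

        for (int c = 1; c < k; c++) {
            // Update distances with the most recently added center.
            double total = 0.0;
            for (int i = 0; i < n; i++) {
                minDist[i] = Math.min(minDist[i], squaredDistance(points[i], centers[c - 1]));
                total += minDist[i];
            }
            int chosen;
            if (total == 0.0) {
                // Every point coincides with a chosen center; fall back to uniform.
                chosen = rnd.nextInt(n);
            } else {
                // Sample index i with probability minDist[i] / total.
                double r = rnd.nextDouble() * total;
                double cumulative = 0.0;
                chosen = n - 1;
                for (int i = 0; i < n; i++) {
                    cumulative += minDist[i];
                    if (cumulative > r) {
                        chosen = i;
                        break;
                    }
                }
            }
            centers[c] = points[chosen].clone();
        }
        return centers;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}
```

The returned centers would then serve as the starting point for the usual kMeans iterations. Points that coincide with an already chosen center have weight zero and are therefore not drawn again while other points remain.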



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
