GitHub user sachingoel0101 reopened a pull request:

    https://github.com/apache/flink/pull/757

    [FLINK-2131][ml]: Initialization schemes for k-means clustering

    This adds the two most common initialization strategies for the k-means
    clustering algorithm: random initialization and kmeans++ initialization.
    Further details are at https://issues.apache.org/jira/browse/FLINK-2131
    [Edit]: Work on kmeans|| has been started and just needs to be finalized.
    [Edit]: kmeans|| implementation finished. 
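
For readers unfamiliar with the two schemes named above, the following is a minimal, library-independent Python sketch of random initialization and kmeans++ seeding. It is an illustration only, not the Flink ML code in this pull request: the function names (`random_init`, `kmeans_pp_init`, `squared_distance`) and the representation of points as tuples are assumptions made for the example, and kmeans|| is not covered.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_init(points, k, seed=None):
    """Random initialization: pick k distinct input points uniformly."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    return rng.sample(points, k)


def kmeans_pp_init(points, k, seed=None):
    """kmeans++ seeding (hypothetical helper, not the PR's implementation).

    The first center is drawn uniformly. Each later center is drawn with
    probability proportional to the squared distance from a point to its
    nearest already-chosen center.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    # d2[i] holds the squared distance of points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Only the new center can reduce a point's nearest-center distance.
        d2 = [min(d, squared_distance(p, nxt)) for d, p in zip(d2, points)]
    return centers
```

Because points that already coincide with a chosen center carry zero weight, kmeans++ tends to spread the initial centers across the data, whereas random initialization may place several centers in the same cluster.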

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sachingoel0101/flink clustering_initializations

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/757.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #757
    
----
commit dc2de88bf5e3148bb116cad607fc3c61d9dceac6
Author: Sachin Goel <[email protected]>
Date:   2015-06-02T06:44:30Z

    Random and kmeans++ initialization methods added

commit 4a39a19c1425259c71ac6d922b4d9a9f2e7d1c6e
Author: Sachin Goel <[email protected]>
Date:   2015-06-02T15:42:58Z

    Merge https://github.com/apache/flink into clustering_initializations

commit cdbb3a0801d364935d455798c695f4615ae74e76
Author: Sachin Goel <[email protected]>
Date:   2015-06-02T19:49:24Z

    Merge https://github.com/apache/flink into clustering_initializations

commit 7496e21462e4efc0813450971ae6cbc94d2b2c15
Author: Sachin Goel <[email protected]>
Date:   2015-06-02T22:41:20Z

    Initialization costs of random and kmeans++ added

commit 8033c87b71686bd3955281db12583592549406cb
Author: Sachin Goel <[email protected]>
Date:   2015-06-05T21:54:10Z

    Merge https://github.com/apache/flink into clustering_initializations

commit 29ed1d3fb31aa038d6ed1a5bf16d58f19565cdf8
Author: Sachin Goel <[email protected]>
Date:   2015-06-05T22:52:02Z

    Removed cost parameter from Algorithm itself. Leaving it to the user for
    now. Also added support for weighted input data sets

commit 5286c3c21d5019f6ba8ab67c2074570087bc1b3a
Author: Sachin Goel <[email protected]>
Date:   2015-06-06T05:04:55Z

    An initial draft of kmeans-par method

commit f3bfad4fc0c6576af14f1e981f8e778445856355
Author: Sachin Goel <[email protected]>
Date:   2015-06-08T10:36:32Z

    All three initialization schemes implemented and tested

commit 8496b8fd627ade8dbe7b92949d35d3cce704f1cc
Author: Sachin Goel <[email protected]>
Date:   2015-06-08T10:36:58Z

    Merge https://github.com/apache/flink into clustering_initializations

commit 3765a3e6a77a8bdbac21d03be1c43263925b1495
Author: Sachin Goel <[email protected]>
Date:   2015-06-30T08:57:41Z

    Merge remote-tracking branch 'upstream/master' into clustering_initializations

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
