[ 
https://issues.apache.org/jira/browse/FLINK-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307166#comment-15307166
 ] 

ASF GitHub Bot commented on FLINK-1707:
---------------------------------------

GitHub user joseprupi opened a pull request:

    https://github.com/apache/flink/pull/2053

    Affinity Propagation

    Hello,
    
    I have added an implementation of Binary Affinity Propagation to Gelly.
    The Jira issue is:
    
    https://issues.apache.org/jira/browse/FLINK-1707
    
    This is not an implementation of the original Affinity Propagation
    algorithm. The binary variant is described in:
    
    http://www.psi.toronto.edu/pubs2/2009/NC09%20-%20SimpleAP.pdf
    
    I have also added a simple example. To check the implementation, I
    compared its results with those of the scikit implementation of the
    original Affinity Propagation. The results are not exactly the same, but
    they make sense to me, and I think that by tuning the epsilon parameter
    in the Flink implementation and the number of iterations in scikit we
    could get the same result. I haven't found another implementation of the
    binary algorithm to compare against.
    
    The scikit example clusters stock prices. I've modified it to cluster
    more stocks than the original. The results using scikit are:
    
    Cluster 1: Pepsi, Coca Cola, Kellogg
    Cluster 2: Navistar
    Cluster 3: Kimberly-Clark, Colgate-Palmolive, Procter Gamble, Kraft Foods
    Cluster 4: Wal-Mart
    Cluster 5: Comcast, Time Warner, Cablevision
    Cluster 6: ConocoPhillips, Apple, GlaxoSmithKline, Microsoft, SAP, Pfizer, 
Novartis, 3M, Sanofi-Aventis, IBM, Chevron, DuPont de Nemours, CVS, Total, 
Caterpillar, Home Depot, Valero Energy, Yahoo, Exxon, Mc Donalds, Cisco, 
Unilever
    Cluster 7: Ryder, Sony, Amazon, Marriott, Canon, Texas instruments, Ford, 
Toyota, Honda, HP, Mitsubishi, Xerox
    Cluster 8: American express, Goldman Sachs, General Electrics, Wells Fargo, 
Bank of America, AIG, JPMorgan Chase
    Cluster 9: Raytheon, Boeing, Walgreen, Lookheed Martin, General Dynamics, 
Northrop Grumman
    
    Using the Flink implementation:
    
    Cluster 1: CVS, Walgreen
    Cluster 2: American express, Goldman Sachs, General Electrics, Wells Fargo, 
Bank of America, AIG, JPMorgan Chase
    Cluster 3: Navistar
    Cluster 4: Kimberly-Clark, Colgate-Palmolive, Procter Gamble, Kellogg
    Cluster 5: Ryder, Sony, Marriott, Canon, Texas instruments, Ford, Toyota, 
Honda, HP, Mitsubishi, Xerox
    Cluster 6: Amazon, Yahoo
    Cluster 7: Pepsi, Coca Cola, Kraft Foods
    Cluster 8: Comcast, Time Warner, Cablevision
    Cluster 9: Raytheon, Boeing, Lookheed Martin, General Dynamics, Northrop 
Grumman
    Cluster 10: Wal-Mart
    Cluster 11: ConocoPhillips, Apple, GlaxoSmithKline, Microsoft, SAP, Pfizer, 
Novartis, 3M, Sanofi-Aventis, IBM, Chevron, DuPont de Nemours, Total, 
Caterpillar, Home Depot, Valero Energy, Exxon, Mc Donalds, Cisco, Unilever
    
    These results can be reproduced with the scikit branch in my Flink
    repository and https://github.com/joseprupi/stockclusteringtest.
    
    As future work, I guess the original algorithm could be implemented, and
    maybe also the Capacitated Affinity Propagation mentioned in the paper.
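
The two cluster listings in the description above can also be compared mechanically rather than by eye. The following is a minimal sketch using only the Python standard library, not part of the pull request. The memberships are transcribed from the listings, the helper names (`parse`, `labels`, `rand_index`) are hypothetical, and the Rand index is just one possible agreement measure chosen here for illustration.

```python
from itertools import combinations


def parse(listing):
    """Split 'a, b; c, d' into one frozenset of names per cluster."""
    return [frozenset(name.strip() for name in cluster.split(","))
            for cluster in listing.split(";") if cluster.strip()]


# Memberships transcribed from the scikit listing in the description.
SCIKIT = parse(
    "Pepsi, Coca Cola, Kellogg;"
    "Navistar;"
    "Kimberly-Clark, Colgate-Palmolive, Procter Gamble, Kraft Foods;"
    "Wal-Mart;"
    "Comcast, Time Warner, Cablevision;"
    "ConocoPhillips, Apple, GlaxoSmithKline, Microsoft, SAP, Pfizer,"
    "Novartis, 3M, Sanofi-Aventis, IBM, Chevron, DuPont de Nemours, CVS,"
    "Total, Caterpillar, Home Depot, Valero Energy, Yahoo, Exxon,"
    "Mc Donalds, Cisco, Unilever;"
    "Ryder, Sony, Amazon, Marriott, Canon, Texas instruments, Ford, Toyota,"
    "Honda, HP, Mitsubishi, Xerox;"
    "American express, Goldman Sachs, General Electrics, Wells Fargo,"
    "Bank of America, AIG, JPMorgan Chase;"
    "Raytheon, Boeing, Walgreen, Lookheed Martin, General Dynamics,"
    "Northrop Grumman"
)

# Memberships transcribed from the Flink listing in the description.
FLINK = parse(
    "CVS, Walgreen;"
    "American express, Goldman Sachs, General Electrics, Wells Fargo,"
    "Bank of America, AIG, JPMorgan Chase;"
    "Navistar;"
    "Kimberly-Clark, Colgate-Palmolive, Procter Gamble, Kellogg;"
    "Ryder, Sony, Marriott, Canon, Texas instruments, Ford, Toyota, Honda,"
    "HP, Mitsubishi, Xerox;"
    "Amazon, Yahoo;"
    "Pepsi, Coca Cola, Kraft Foods;"
    "Comcast, Time Warner, Cablevision;"
    "Raytheon, Boeing, Lookheed Martin, General Dynamics, Northrop Grumman;"
    "Wal-Mart;"
    "ConocoPhillips, Apple, GlaxoSmithKline, Microsoft, SAP, Pfizer,"
    "Novartis, 3M, Sanofi-Aventis, IBM, Chevron, DuPont de Nemours, Total,"
    "Caterpillar, Home Depot, Valero Energy, Exxon, Mc Donalds, Cisco,"
    "Unilever"
)


def labels(clusters):
    """Map each name to the index of the cluster that contains it."""
    return {name: i for i, cluster in enumerate(clusters) for name in cluster}


def rand_index(a, b):
    """Fraction of name pairs that both clusterings treat the same way."""
    la, lb = labels(a), labels(b)
    if la.keys() != lb.keys():
        raise ValueError("clusterings cover different sets of names")
    pairs = list(combinations(sorted(la), 2))
    agree = sum((la[x] == la[y]) == (lb[x] == lb[y]) for x, y in pairs)
    return agree / len(pairs)


if __name__ == "__main__":
    identical = set(SCIKIT) & set(FLINK)
    print(f"{len(identical)} clusters are identical in both results")
    print(f"Rand index: {rand_index(SCIKIT, FLINK):.3f}")
```

A Rand index of 1.0 would mean the two results group every pair of stocks identically, so the value gives a rough single-number summary to track while tuning epsilon or the iteration count.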


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/joseprupi/flink master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2053.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2053
    
----
commit d3b4cd98fcd07b5bba265baccbb465b6edc22e09
Author: Josep Rubio <[email protected]>
Date:   2016-05-31T02:31:41Z

    Affinity Propagation

----


> Add an Affinity Propagation Library Method
> ------------------------------------------
>
>                 Key: FLINK-1707
>                 URL: https://issues.apache.org/jira/browse/FLINK-1707
>             Project: Flink
>          Issue Type: New Feature
>          Components: Gelly
>            Reporter: Vasia Kalavri
>            Assignee: Josep Rubió
>            Priority: Minor
>              Labels: requires-design-doc
>         Attachments: Binary_Affinity_Propagation_in_Flink_design_doc.pdf
>
>
> This issue proposes adding an implementation of the Affinity Propagation 
> algorithm as a Gelly library method and a corresponding example.
> The algorithm is described in paper [1] and a description of a vertex-centric 
> implementation can be found in [2].
> [1]: http://www.psi.toronto.edu/affinitypropagation/FreyDueckScience07.pdf
> [2]: http://event.cwi.nl/grades2014/00-ching-slides.pdf
> Design doc:
> https://docs.google.com/document/d/1QULalzPqMVICi8jRVs3S0n39pell2ZVc7RNemz_SGA4/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
