[
https://issues.apache.org/jira/browse/FLINK-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622039#comment-14622039
]
ASF GitHub Bot commented on FLINK-2310:
---------------------------------------
Github user vasia commented on the pull request:
https://github.com/apache/flink/pull/892#issuecomment-120319905
Hi @shghatge!
I agree, let's deal with the approximate version as a separate issue. In
the end though, it would be nice to have a single library method and an input
parameter to decide whether the computation should be exact or approximate.
Regarding the Bloom filter, the idea is for each vertex to build a Bloom
filter of its neighbors and "send" it to its neighbors. Then, each vertex can
compare its own (exact) neighborhood with the neighborhoods encoded in the
Bloom filters it receives. Take a look at how approximate Jaccard is computed
in the okapi library
[here](https://github.com/grafos-ml/okapi/blob/master/src/main/java/ml/grafos/okapi/graphs/similarity/Jaccard.java)
(class `JaccardApproximation`).
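
The Bloom filter exchange described above could be sketched roughly as follows. This is a minimal, single-machine illustration in Python, not the okapi or Gelly implementation: the `BloomFilter` class, its `num_bits` and `num_hashes` defaults, and the `approximate_jaccard` function are all hypothetical names chosen for the example. It also assumes each vertex sends its exact neighbor count along with its filter, so the receiver can estimate the size of the union.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; num_bits and num_hashes are illustrative defaults."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def approximate_jaccard(neighbors):
    """Estimate Jaccard similarity for each directed pair (v, u) of neighbors.

    `neighbors` maps every vertex to the set of its neighbors (undirected).
    """
    # Each vertex summarises its neighborhood in a Bloom filter.
    filters = {}
    for v, nbrs in neighbors.items():
        bf = BloomFilter()
        for n in nbrs:
            bf.add(n)
        filters[v] = bf

    scores = {}
    for v, nbrs in neighbors.items():
        for u in nbrs:
            # v "receives" u's filter and neighbor count (assumed to be sent
            # together) and tests its own exact neighbors against the filter.
            common = sum(1 for n in nbrs if n in filters[u])
            # False positives can only inflate the count, so cap it.
            common = min(common, len(neighbors[u]))
            union = len(nbrs) + len(neighbors[u]) - common
            scores[(v, u)] = common / union
    return scores
```

Since a Bloom filter never reports a false negative, the estimate under these assumptions can only overshoot the exact Jaccard value, and the overshoot shrinks as `num_bits` grows.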
Let me know if you have more questions :)
> Add an Adamic-Adar Similarity example
> -------------------------------------
>
> Key: FLINK-2310
> URL: https://issues.apache.org/jira/browse/FLINK-2310
> Project: Flink
> Issue Type: Task
> Components: Gelly
> Reporter: Andra Lungu
> Assignee: Shivani Ghatge
> Priority: Minor
>
> Just like Jaccard, the Adamic-Adar algorithm measures the similarity between
> pairs of nodes. However, instead of counting the common neighbors and dividing
> them by the total number of neighbors, the similarity is weighted according
> to the vertex degrees. In particular, each vertex's weight is
> log(1/numberOfEdges), where numberOfEdges is the degree of that vertex.
> The Adamic-Adar algorithm can be broken into three steps:
> 1). For each vertex, compute the log of its inverse degree (with the formula
> above) and set it as the vertex value.
> 2). Each vertex will then send this newly computed value, along with a list of
> its neighbors, to the targets of its out-edges.
> 3). Weigh the edges with the Adamic-Adar index: the sum over n in CN of
> log(1/k_n) (CN is the set of all common neighbors of two vertices x, y; k_n is
> the degree of node n). See [2].
> Prerequisites:
> - Full understanding of the Jaccard Similarity Measure algorithm
> - Reading the associated literature:
> [1] http://social.cs.uiuc.edu/class/cs591kgk/friendsadamic.pdf
> [2] http://stackoverflow.com/questions/22565620/fast-algorithm-to-compute-adamic-adar
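
The three steps in the issue description quoted above could be sketched on a single machine as follows. This is only an illustration in Python, not the Gelly example the issue asks for, and `adamic_adar_edge_weights` and its `weight` parameter are hypothetical names. One assumption to flag: the default weight is 1/log(k_n), the form usually given for the Adamic-Adar index, whereas the description writes log(1/k_n). Pass a different `weight` function to reproduce the description literally.

```python
import math
from collections import defaultdict


def adamic_adar_edge_weights(edges, weight=lambda k: 1.0 / math.log(k)):
    """Return {(x, y): index} for every undirected edge, with x < y.

    `edges` is an iterable of (u, v) pairs. `weight` maps a vertex degree k
    to that vertex's contribution (assumed default: 1 / log(k)).
    """
    # Build undirected adjacency sets and a de-duplicated edge set.
    neighbors = defaultdict(set)
    edge_set = set()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)
        edge_set.add(tuple(sorted((u, v))))

    result = {}
    for x, y in edge_set:
        # Steps 2-3: the common neighbors of x and y each contribute
        # their degree-based value (step 1) to the edge weight.
        common = neighbors[x] & neighbors[y]
        # A common neighbor touches both x and y, so its degree is >= 2
        # and the default weight never divides by log(1) = 0.
        result[(x, y)] = sum(weight(len(neighbors[n])) for n in common)
    return result
```

For example, `adamic_adar_edge_weights([(1, 2), (1, 3), (2, 3), (3, 4)])` weighs edge (1, 2) by the single common neighbor 3, which has degree 3. Edge (3, 4) has no common neighbors and gets weight 0.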