[
https://issues.apache.org/jira/browse/FLINK-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291192#comment-15291192
]
ASF GitHub Bot commented on FLINK-3780:
---------------------------------------
Github user vasia commented on a diff in the pull request:
https://github.com/apache/flink/pull/1980#discussion_r63890500
--- Diff: docs/apis/batch/libs/gelly.md ---
@@ -2051,6 +2052,26 @@ The algorithm takes a directed, vertex (and possibly edge) attributed graph as i
vertex represents a group of vertices and each edge represents a group of edges from the input graph. Furthermore, each
vertex and edge in the output graph stores the common group value and the number of represented elements.
+### Jaccard Index
+
+#### Overview
+The Jaccard Index measures the similarity between vertex neighborhoods. Scores range from 0.0 (no common neighbors) to
+1.0 (all neighbors are common).
+
+#### Details
+Counting common neighbors for pairs of vertices is equivalent to counting the two-paths consisting of two edges
--- End diff --
By "two-paths", do you mean triads, i.e. open triangles?
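
For reference, here is a minimal sketch of one way to read "two-path": a pair of edges u - w - v that share a middle vertex w. It is plain Python, not Gelly code; the function name `two_paths` and the sample edge list are made up for illustration, and the graph is assumed to be undirected. The sketch also separates the two-paths whose endpoints are not directly connected (open triads) from those that close a triangle.

```python
from collections import defaultdict
from itertools import combinations


def two_paths(edges):
    """Yield (u, w, v) for every two-path u - w - v, with u < v.

    Hypothetical helper: treats the input as an undirected simple graph.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    # Every unordered pair of neighbors of w forms one two-path through w.
    for w, adjacent in neighbors.items():
        for u, v in combinations(sorted(adjacent), 2):
            yield (u, w, v)


# Sample graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
paths = sorted(two_paths(edges))

# A two-path is "open" when its endpoints are not joined by an edge.
edge_set = {frozenset(e) for e in edges}
open_paths = [p for p in paths if frozenset((p[0], p[2])) not in edge_set]
```

Under this reading, the set of two-paths contains both open and closed triads, since a shared neighbor counts whether or not the endpoints are themselves adjacent.
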
> Jaccard Similarity
> ------------------
>
> Key: FLINK-3780
> URL: https://issues.apache.org/jira/browse/FLINK-3780
> Project: Flink
> Issue Type: New Feature
> Components: Gelly
> Affects Versions: 1.1.0
> Reporter: Greg Hogan
> Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> Implement a Jaccard Similarity algorithm computing all non-zero similarity
> scores. This algorithm is similar to {{TriangleListing}} but instead of
> joining two-paths against an edge list we count two-paths.
> {{flink-gelly-examples}} currently has {{JaccardSimilarityMeasure}}, which
> relies on {{Graph.getTriplets()}} and therefore only computes similarity
> scores for neighbors, not for neighbors-of-neighbors.
> This algorithm is easily modified for other similarity scores such as
> Adamic-Adar similarity where the sum of endpoint degrees is replaced by the
> degree of the middle vertex.
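
The counting approach described above can be sketched in a few lines. This is an illustrative single-machine version in plain Python, not the Gelly implementation: the function name `jaccard_scores` and the sample graph are hypothetical, and the graph is assumed to be undirected and simple. The number of two-paths between u and v is the number of shared neighbors, and the number of distinct neighbors follows from the sum of the endpoint degrees minus that count.

```python
from collections import Counter, defaultdict
from itertools import combinations


def jaccard_scores(edges):
    """Return {(u, v): score} for all vertex pairs with a non-zero score.

    Hypothetical helper: assumes an undirected simple graph.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Count two-paths u - w - v; each one is a shared neighbor w of (u, v).
    shared = Counter()
    for adjacent in neighbors.values():
        for u, v in combinations(sorted(adjacent), 2):
            shared[(u, v)] += 1

    scores = {}
    for (u, v), count in shared.items():
        # |N(u) union N(v)| = deg(u) + deg(v) - |N(u) intersect N(v)|
        distinct = len(neighbors[u]) + len(neighbors[v]) - count
        scores[(u, v)] = count / distinct
    return scores


# Sample graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
scores = jaccard_scores([(1, 2), (2, 3), (1, 3), (3, 4)])
```

Pairs with no common neighbor, such as (3, 4) in the sample graph, never appear in the two-path count, so only non-zero scores are produced. Pairs such as (1, 4) that are not directly connected still receive a score, which is the neighbors-of-neighbors case mentioned above.
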
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)