[
https://issues.apache.org/jira/browse/FLINK-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293429#comment-15293429
]
ASF GitHub Bot commented on FLINK-3780:
---------------------------------------
Github user vasia commented on a diff in the pull request:
https://github.com/apache/flink/pull/1980#discussion_r64048334
--- Diff: docs/apis/batch/libs/gelly.md ---
@@ -2250,14 +2250,33 @@ graph.run(new TranslateVertexValues(new
LongValueAddOffset(vertexCount)));
</tr>
<tr>
- <td>translate.<br/><strong>TranslateEdgeValues</strong></td>
+ <td>asm.translate.<br/><strong>TranslateEdgeValues</strong></td>
<td>
<p>Translate edge values using the given
<code>TranslateFunction</code>.</p>
{% highlight java %}
graph.run(new TranslateEdgeValues(new Nullify()));
{% endhighlight %}
</td>
</tr>
+
+ <tr>
+ <td>library.similarity.<br/><strong>JaccardIndex</strong></td>
+ <td>
+ <p>Measures the similarity between vertex neighborhoods. The
Jaccard Index score is computed as the number of shared neighbors divided by the
number of distinct neighbors. Scores range from 0.0 (no shared neighbors) to
1.0 (all neighbors are shared).</p>
--- End diff --
Why did you add this here rather than in the "Usage" section of the library
method?
I find it a bit confusing. You describe graph algorithms as building
blocks for other algorithms. Does the Jaccard index fall into this category?
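For reference, the score described in the diff (shared neighbors divided by distinct neighbors) can be sketched in plain Java. This is only an illustration of the formula, not the Gelly `JaccardIndex` implementation; the class name `JaccardSketch` and the method `jaccard` are hypothetical, and returning 0.0 for two empty neighborhoods is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

public class JaccardSketch {

    /**
     * Jaccard score of two neighbor sets: |shared| / |distinct|.
     * Returns 0.0 when both sets are empty (an assumption of this sketch).
     */
    public static double jaccard(Set<Long> a, Set<Long> b) {
        // Shared neighbors are the intersection of the two sets.
        Set<Long> shared = new HashSet<>(a);
        shared.retainAll(b);
        // Distinct neighbors are the union: |a| + |b| - |shared|.
        int distinct = a.size() + b.size() - shared.size();
        return distinct == 0 ? 0.0 : (double) shared.size() / distinct;
    }

    public static void main(String[] args) {
        Set<Long> u = Set.of(1L, 2L, 3L);
        Set<Long> v = Set.of(2L, 3L, 4L);
        // 2 shared neighbors (2, 3) out of 4 distinct neighbors (1, 2, 3, 4).
        System.out.println(jaccard(u, v)); // prints 0.5
    }
}
```

Disjoint neighborhoods give 0.0 and identical non-empty neighborhoods give 1.0, matching the range stated in the documentation text.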
> Jaccard Similarity
> ------------------
>
> Key: FLINK-3780
> URL: https://issues.apache.org/jira/browse/FLINK-3780
> Project: Flink
> Issue Type: New Feature
> Components: Gelly
> Affects Versions: 1.1.0
> Reporter: Greg Hogan
> Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> Implement a Jaccard Similarity algorithm computing all non-zero similarity
> scores. This algorithm is similar to {{TriangleListing}}, but instead of
> joining two-paths against an edge list, we count two-paths.
> {{flink-gelly-examples}} currently has {{JaccardSimilarityMeasure}}, which
> relies on {{Graph.getTriplets()}} and therefore computes similarity scores
> only for neighbors, not for neighbors-of-neighbors.
> This algorithm is easily modified for other similarity scores such as
> Adamic-Adar similarity, where the sum of endpoint degrees is replaced by the
> degree of the middle vertex.
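
The two-path counting idea in the description above can be sketched on a single machine as follows. This is a minimal illustration, not the proposed Flink implementation: the class `TwoPathJaccard`, the `scores` method, and the `"u,v"` string keys are hypothetical, and the sketch assumes a simple undirected graph without self-loops. Each middle vertex emits one two-path per pair of its neighbors; the number of two-paths between u and v is their shared-neighbor count, and the distinct-neighbor count is the sum of the endpoint degrees minus that count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class TwoPathJaccard {

    /**
     * Returns a score for every vertex pair with at least one shared neighbor.
     * Keys have the form "u,v" with u < v (a convention of this sketch).
     */
    public static Map<String, Double> scores(long[][] edges) {
        // Build undirected adjacency sets; the set size is the vertex degree.
        Map<Long, Set<Long>> adj = new TreeMap<>();
        for (long[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new TreeSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new TreeSet<>()).add(e[0]);
        }

        // Count two-paths u - w - v: every middle vertex w contributes one
        // two-path for each pair (u, v) of its neighbors, with u < v.
        Map<Long, Map<Long, Integer>> shared = new TreeMap<>();
        for (Set<Long> neighbors : adj.values()) {
            List<Long> sorted = new ArrayList<>(neighbors);
            for (int i = 0; i < sorted.size(); i++) {
                for (int j = i + 1; j < sorted.size(); j++) {
                    shared.computeIfAbsent(sorted.get(i), k -> new TreeMap<>())
                          .merge(sorted.get(j), 1, Integer::sum);
                }
            }
        }

        // score = shared / (deg(u) + deg(v) - shared)
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<Long, Map<Long, Integer>> byU : shared.entrySet()) {
            long u = byU.getKey();
            for (Map.Entry<Long, Integer> byV : byU.getValue().entrySet()) {
                long v = byV.getKey();
                int count = byV.getValue();
                int distinct = adj.get(u).size() + adj.get(v).size() - count;
                result.put(u + "," + v, (double) count / distinct);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Triangle 1-2-3 with a pendant vertex 4 attached to 3.
        long[][] edges = {{1, 2}, {1, 3}, {2, 3}, {3, 4}};
        System.out.println(scores(edges));
    }
}
```

In this example vertices 1 and 4 are not adjacent but share neighbor 3, so the pair still receives a score (0.5), whereas the adjacent pair 3 and 4 shares no neighbor and is omitted as a zero score. A triplet-based approach would cover only adjacent pairs.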