[
https://issues.apache.org/jira/browse/FLINK-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499837#comment-14499837
]
ASF GitHub Bot commented on FLINK-1758:
---------------------------------------
Github user vasia commented on a diff in the pull request:
https://github.com/apache/flink/pull/576#discussion_r28594064
--- Diff: docs/gelly_guide.md ---
@@ -269,7 +269,15 @@ Neighborhood Methods
Neighborhood methods allow vertices to perform an aggregation on their
first-hop neighborhood.
-`reduceOnEdges()` can be used to compute an aggregation on the neighboring
edges of a vertex, while `reduceOnNeighbors()` has access on both the
neighboring edges and vertices. The neighborhood scope is defined by the
`EdgeDirection` parameter, which takes the values `IN`, `OUT` or `ALL`. `IN`
will gather all in-coming edges (neighbors) of a vertex, `OUT` will gather all
out-going edges (neighbors), while `ALL` will gather all edges (neighbors).
+`groupReduceOnEdges()` can be used to compute an aggregation on the
neighboring edges of a vertex,
+while `groupReduceOnNeighbors()` has access to both the neighboring edges
and vertices. The neighborhood scope
+is defined by the `EdgeDirection` parameter, which takes the values `IN`,
`OUT` or `ALL`. `IN` will gather all in-coming edges (neighbors) of a vertex,
`OUT` will gather all out-going edges (neighbors), while `ALL` will gather all
edges (neighbors).
+
+The `groupReduceOnEdges()` and `groupReduceOnNeighbors()` methods return
zero, one or more values per vertex.
+When returning a single value per vertex, `reduceOnEdges()` or
`reduceOnNeighbors()` should be called
+as they are more efficient. Nevertheless, when the reduce on edges
modifies the value produced per vertex, for
+instance by multiplying it with a constant, `groupReduceOnEdges()` or
`groupReduceOnNeighbors()` must be used
+as illustrated in the third code snippet.
--- End diff --
I would rephrase this into something like the following: "When the
user-defined function to be applied to the neighborhood is associative and
commutative, it is highly advised to use the `reduceOnEdges()` and
`reduceOnNeighbors()` methods. These methods can exploit combiners internally
and significantly improve performance."
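To illustrate why associativity and commutativity matter here, below is a minimal sketch in plain Java. It is not Gelly code and makes no claim about how Flink implements combiners internally; the class `CombinerSketch`, its methods, and the sample edge weights are hypothetical names and values chosen for the example. It pre-aggregates each partition of a vertex's edge values and then merges the partial results, which is what a combiner is assumed to do. The result matches a single-pass reduction only when the function is associative and commutative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical illustration only; not part of the Gelly API.
public class CombinerSketch {

    // Fold all values in one pass, left to right (no pre-aggregation).
    static long reduceAll(List<Long> values, BinaryOperator<Long> fn) {
        long acc = values.get(0);
        for (int i = 1; i < values.size(); i++) {
            acc = fn.apply(acc, values.get(i));
        }
        return acc;
    }

    // Fold each partition locally first (the "combine" step),
    // then fold the partial results into the final value.
    static long reduceWithCombiner(List<List<Long>> partitions, BinaryOperator<Long> fn) {
        List<Long> partials = new ArrayList<>();
        for (List<Long> partition : partitions) {
            if (!partition.isEmpty()) {
                partials.add(reduceAll(partition, fn));
            }
        }
        return reduceAll(partials, fn);
    }

    public static void main(String[] args) {
        // Assumed edge weights of one vertex, and the same weights split over two partitions.
        List<Long> edgeWeights = Arrays.asList(3L, 1L, 4L, 1L, 5L);
        List<List<Long>> partitions = Arrays.asList(
                Arrays.asList(3L, 1L),
                Arrays.asList(4L, 1L, 5L));

        // Sum is associative and commutative: both strategies agree.
        BinaryOperator<Long> sum = Long::sum;
        System.out.println(reduceAll(edgeWeights, sum));          // 14
        System.out.println(reduceWithCombiner(partitions, sum));  // 14

        // Subtraction is neither: pre-aggregation changes the result.
        BinaryOperator<Long> minus = (a, b) -> a - b;
        System.out.println(reduceAll(edgeWeights, minus));         // -8
        System.out.println(reduceWithCombiner(partitions, minus)); // 4
    }
}
```

In the sketch, a function like the sum is safe to pre-aggregate, which corresponds to the case where `reduceOnEdges()` or `reduceOnNeighbors()` is the recommended choice; a function like the subtraction is not, and would need to see the whole neighborhood at once.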
> Extend Gelly's neighborhood methods
> -----------------------------------
>
> Key: FLINK-1758
> URL: https://issues.apache.org/jira/browse/FLINK-1758
> Project: Flink
> Issue Type: Improvement
> Components: Gelly
> Affects Versions: 0.9
> Reporter: Vasia Kalavri
> Assignee: Andra Lungu
>
> Currently, the neighborhood methods only allow returning a single value per
> vertex. In many cases, it is desirable to return several values, or none, per
> vertex. This is the case in clustering coefficient computation,
> vertex-centric Jaccard, algorithms where a vertex computes a value per edge,
> or when a vertex computes a value only for some of its neighbors.
> This issue proposes to
> - change the current reduceOnEdges/reduceOnNeighbors methods to use
> combinable reduce operations where possible
> - provide groupReduce versions, which will use a Collector and allow
> returning zero or more values per vertex.