[GitHub] flink pull request: [FLINK-2149][gelly] Simplified Jaccard Example

2015-06-20 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/770#issuecomment-113752297
  
Merging...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2149][gelly] Simplified Jaccard Example

2015-06-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/770




[GitHub] flink pull request: [FLINK-2149][gelly] Simplified Jaccard Example

2015-06-14 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/770#issuecomment-111807874
  
PR updated.




[GitHub] flink pull request: [FLINK-2149][gelly] Simplified Jaccard Example

2015-06-13 Thread vasia
Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/770#discussion_r32374939
  
--- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/JaccardSimilarityMeasure.java ---
@@ -66,34 +63,47 @@ public static void main(String [] args) throws Exception {
 
         DataSet<Edge<Long, Double>> edges = getEdgesDataSet(env);
 
-        Graph<Long, NullValue, Double> graph = Graph.fromDataSet(edges, env);
+        Graph<Long, HashSet<Long>, Double> graph = Graph.fromDataSet(edges,
+                new MapFunction<Long, HashSet<Long>>() {
 
-        DataSet<Vertex<Long, HashSet<Long>>> verticesWithNeighbors =
-                graph.groupReduceOnEdges(new GatherNeighbors(), EdgeDirection.ALL);
+                    @Override
+                    public HashSet<Long> map(Long id) throws Exception {
+                        HashSet<Long> neighbors = new HashSet<Long>();
+                        neighbors.add(id);
 
-        Graph<Long, HashSet<Long>, Double> graphWithVertexValues = Graph.fromDataSet(verticesWithNeighbors, edges, env);
+                        return new HashSet<Long>(neighbors);
+                    }
+                }, env);
 
-        // the edge value will be the Jaccard similarity coefficient(number of common neighbors/ all neighbors)
-        DataSet<Tuple3<Long, Long, Double>> edgesWithJaccardWeight = graphWithVertexValues.getTriplets()
-                .map(new WeighEdgesMapper());
+        // create the set of neighbors
+        DataSet<Tuple2<Long, HashSet<Long>>> computedNeighbors =
+                graph.reduceOnNeighbors(new GatherNeighbors(), EdgeDirection.ALL);
 
-        DataSet<Edge<Long, Double>> result = graphWithVertexValues.joinWithEdges(edgesWithJaccardWeight,
-                new MapFunction<Tuple2<Double, Double>, Double>() {
+        // join with the vertices to update the node values
+        DataSet<Vertex<Long, HashSet<Long>>> verticesWithNeighbors =
+                graph.joinWithVertices(computedNeighbors, new MapFunction<Tuple2<HashSet<Long>, HashSet<Long>>,
+                        HashSet<Long>>() {
 
                     @Override
-                    public Double map(Tuple2<Double, Double> value) throws Exception {
-                        return value.f1;
+                    public HashSet<Long> map(Tuple2<HashSet<Long>, HashSet<Long>> tuple2) throws Exception {
+                        return tuple2.f1;
                     }
-                }).getEdges();
+                }).getVertices();
+
+        Graph<Long, HashSet<Long>, Double> graphWithVertexValues = Graph.fromDataSet(verticesWithNeighbors, edges, env);
--- End diff --

joinWithVertices can give you the Graph directly :)




[GitHub] flink pull request: [FLINK-2149][gelly] Simplified Jaccard Example

2015-06-03 Thread andralungu
GitHub user andralungu opened a pull request:

https://github.com/apache/flink/pull/770

[FLINK-2149][gelly] Simplified Jaccard Example

This PR simplifies Gelly's Jaccard example by using the more efficient 
reduceOnNeighbors rather than groupReduceOnNeighbors. 
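
For readers without a Flink setup, here is a minimal plain-Java sketch of the idea, not the code in this PR. It assumes undirected edges and neighbor sets that exclude the vertex itself; the class and method names (JaccardSketch, union, gatherNeighbors, jaccard) are hypothetical and do not come from Gelly. The point is that neighbor sets can be built with a pairwise, associative merge (the shape of function a reduce-style operator expects) instead of a function that sees the whole group at once, and that the Jaccard coefficient is then the number of common neighbors divided by the number of all neighbors.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class JaccardSketch {

    // Pairwise, associative merge of two partial neighbor sets.
    // Neither input is modified.
    static Set<Long> union(Set<Long> a, Set<Long> b) {
        Set<Long> merged = new HashSet<>(a);
        merged.addAll(b);
        return merged;
    }

    // Helper: a mutable one-element set.
    static Set<Long> singleton(long v) {
        Set<Long> s = new HashSet<>();
        s.add(v);
        return s;
    }

    // Build the neighbor set of every vertex from undirected edges {source, target},
    // folding one neighbor at a time into the running set with union().
    static Map<Long, Set<Long>> gatherNeighbors(long[][] edges) {
        Map<Long, Set<Long>> neighbors = new HashMap<>();
        for (long[] e : edges) {
            neighbors.merge(e[0], singleton(e[1]), JaccardSketch::union);
            neighbors.merge(e[1], singleton(e[0]), JaccardSketch::union);
        }
        return neighbors;
    }

    // Jaccard coefficient: |a intersect b| / |a union b|; defined here as 0.0 for two empty sets.
    static double jaccard(Set<Long> a, Set<Long> b) {
        Set<Long> all = union(a, b);
        if (all.isEmpty()) {
            return 0.0;
        }
        Set<Long> common = new HashSet<>(a);
        common.retainAll(b);
        return (double) common.size() / all.size();
    }

    public static void main(String[] args) {
        long[][] edges = {{1, 2}, {1, 3}, {2, 3}, {3, 4}};
        Map<Long, Set<Long>> neighbors = gatherNeighbors(edges);
        // Weigh every edge by the similarity of its endpoints' neighbor sets.
        for (long[] e : edges) {
            double w = jaccard(neighbors.get(e[0]), neighbors.get(e[1]));
            System.out.println(e[0] + " - " + e[1] + " : " + w);
        }
    }
}
```

Because union() only ever needs two partial sets at a time, partial results can be merged in any grouping, which is presumably what makes the reduce-style variant cheaper than handing the full neighborhood to a single function call.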

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andralungu/flink jaccardImprovement

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/770.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #770


commit 0e189c6af9a5fb80b4999a60a431d60cf95944db
Author: andralungu lungu.an...@gmail.com
Date:   2015-06-03T14:12:16Z

[FLINK-2149][gelly] Simplified Jaccard Example



