[jira] [Commented] (FLINK-4204) Clean up gelly-examples

2016-10-24 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602505#comment-15602505 ]

ASF GitHub Bot commented on FLINK-4204:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2670


> Clean up gelly-examples
> ---
>
> Key: FLINK-4204
> URL: https://issues.apache.org/jira/browse/FLINK-4204
> Project: Flink
> Issue Type: Improvement
> Components: Gelly
> Affects Versions: 1.1.0
> Reporter: Vasia Kalavri
> Assignee: Greg Hogan
> Fix For: 1.2.0
>
>
> The gelly-examples module has grown quite big (14 examples) and contains several 
> examples that illustrate the same functionality. Examples should help users 
> understand how to use the API and ideally show how to use 1-2 features each.
> Also, it is helpful to state the purpose of each example in the comments.
> We should keep the example set small and move everything that does not fit 
> there to the library.
> I propose to remove the following:
> - ClusteringCoefficient: the functionality already exists as a library method.
> - HITS: the functionality already exists as a library method.
> - JaccardIndex: the functionality already exists as a library method.
> - SingleSourceShortestPaths: the example shows how to use scatter-gather 
> iterations. HITSAlgorithm shows the same feature plus the use of aggregators. 
> I propose we keep this one instead.
> - TriangleListing: the functionality already exists as a library method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4204) Clean up gelly-examples

2016-10-21 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595767#comment-15595767 ]

ASF GitHub Bot commented on FLINK-4204:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2670
  
I pushed a commit to remove the `GraphMetrics` example.

I think providing drivers for all library methods is both desirable and 
ambitious. If we like the form and functionality of the current drivers then 
I'd like to look at consolidating common functionality where possible. We may 
also be able to put multiple similar algorithms like `JaccardIndex` / 
`AdamicAdar` / `CommonNeighbors` into the same driver.
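
A minimal sketch of what one shared driver for several similarity measures could look like. It is plain Java with no Flink dependency: `SimilarityDriver`, `Measure`, `score`, and the sample graph are hypothetical names chosen for illustration, not the Gelly API and not the code in this pull request. The measure is selected from the first command-line argument.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical combined driver: one entry point, several neighborhood similarity measures. */
final class SimilarityDriver {

    enum Measure { COMMON_NEIGHBORS, JACCARD_INDEX, ADAMIC_ADAR }

    /** Scores the vertex pair (u, v) on an undirected graph given as adjacency sets. */
    static double score(Measure measure, Map<Integer, Set<Integer>> adj, int u, int v) {
        // All three measures start from the shared neighbors of u and v.
        Set<Integer> common = new HashSet<>(adj.get(u));
        common.retainAll(adj.get(v));

        switch (measure) {
            case COMMON_NEIGHBORS:
                return common.size();
            case JACCARD_INDEX: {
                // |N(u) ∩ N(v)| / |N(u) ∪ N(v)|
                Set<Integer> union = new HashSet<>(adj.get(u));
                union.addAll(adj.get(v));
                return union.isEmpty() ? 0.0 : (double) common.size() / union.size();
            }
            case ADAMIC_ADAR: {
                // Sum of 1 / log(degree(w)) over shared neighbors w. For distinct u and v
                // a shared neighbor has degree >= 2, so the logarithm is positive.
                double sum = 0.0;
                for (int w : common) {
                    sum += 1.0 / Math.log(adj.get(w).size());
                }
                return sum;
            }
            default:
                throw new IllegalArgumentException("Unknown measure: " + measure);
        }
    }

    /** Small illustrative graph with edges 1-2, 1-3, 2-3, 2-4, 3-4. */
    static Map<Integer, Set<Integer>> sampleGraph() {
        Map<Integer, Set<Integer>> adj = new HashMap<>();
        int[][] edges = {{1, 2}, {1, 3}, {2, 3}, {2, 4}, {3, 4}};
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        return adj;
    }

    public static void main(String[] args) {
        // The measure name is the only option in this sketch; a real driver would
        // also take input and output parameters.
        Measure measure = args.length > 0 ? Measure.valueOf(args[0]) : Measure.JACCARD_INDEX;
        System.out.println(measure + "(1, 4) = " + score(measure, sampleGraph(), 1, 4));
    }
}
```

The shared intersection step is the kind of common functionality that could be consolidated, leaving each measure as a small branch.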

I had first removed `TriangleListing` as it's not an algorithm but I added 
it back due to Facebook's recent benchmarking: 
https://code.facebook.com/posts/319004238457019/a-comparison-of-state-of-the-art-graph-processing-systems




[jira] [Commented] (FLINK-4204) Clean up gelly-examples

2016-10-21 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595656#comment-15595656 ]

ASF GitHub Bot commented on FLINK-4204:
---

Github user vasia commented on the issue:

https://github.com/apache/flink/pull/2670
  
Hi @greghogan,
I really like the cleanup and new organization!
Two thoughts:
- is the plan to add drivers for all library methods?
- shall we remove the `GraphMetrics` example since there is a better driver?




[jira] [Commented] (FLINK-4204) Clean up gelly-examples

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593069#comment-15593069 ]

ASF GitHub Bot commented on FLINK-4204:
---

GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/2670

[FLINK-4204] [gelly] Clean up gelly-examples

Moves drivers into separate package. Adds default main class to print usage 
listing included classes. Includes documentation for running Gelly examples.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 4204_clean_up_gelly_examples

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2670.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2670


commit 642267c70f362ce5414838aaddbed0dcd6b60934
Author: Greg Hogan 
Date:   2016-08-24T15:32:43Z

[FLINK-4204] [gelly] Clean up gelly-examples

Moves drivers into separate package. Adds default main class to print
usage listing included classes. Includes documentation for running
Gelly examples.






[jira] [Commented] (FLINK-4204) Clean up gelly-examples

2016-07-14 Thread Greg Hogan (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376917#comment-15376917 ]

Greg Hogan commented on FLINK-4204:
---

Like Steve Jobs' iPods, I think Gelly should ship with charged batteries 
[https://www.ted.com/talks/tony_fadell_the_first_secret_of_design_is_noticing/transcript]. 
We should make it as easy as possible for new users (not necessarily 
developers) to run algorithms on data and to perceive the power of Flink.

We can also continue to refactor and condense the drivers to reduce the lines 
of code. I hadn't done this because 1) we're still settling on the standard 
functionality and 2) there is more functionality to be added, such as edge 
weights.

I do question whether TriangleListing is a useful standalone algorithm. For 
counting triangles, ClusteringCoefficient can be used.



[jira] [Commented] (FLINK-4204) Clean up gelly-examples

2016-07-14 Thread Vasia Kalavri (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376834#comment-15376834 ]

Vasia Kalavri commented on FLINK-4204:
--

Separating drivers and examples sounds like a good idea. Do you think we should 
add drivers for every library algorithm? Isn't it enough to provide 1-2 
examples and have good documentation about how users can write their own 
drivers?



[jira] [Commented] (FLINK-4204) Clean up gelly-examples

2016-07-13 Thread Greg Hogan (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374877#comment-15374877 ]

Greg Hogan commented on FLINK-4204:
---

We should also add a {{Main-Class}} to {{flink-gelly-examples}} to print usage 
for running the drivers and example programs. Currently the only way to 
discover the available classes is to read the source or list the classes in 
the jar.
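
A minimal sketch of such a default entry point, assuming the jar manifest's `Main-Class` attribute would point at it. The class name `GellyExamplesUsage` and the listed program names and descriptions are placeholders for illustration, not the actual contents of the module.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical default main class: prints the runnable classes bundled in the jar. */
final class GellyExamplesUsage {

    /** Illustrative registry of class name to description; the entries are placeholders. */
    static Map<String, String> programs() {
        Map<String, String> programs = new LinkedHashMap<>();
        programs.put("org.example.driver.ClusteringCoefficient", "driver: local and global clustering coefficient");
        programs.put("org.example.driver.JaccardIndex", "driver: neighborhood similarity scores");
        programs.put("org.example.SingleSourceShortestPaths", "example: scatter-gather iterations");
        return programs;
    }

    /** Builds the usage text, one line per registered program, in insertion order. */
    static String usage() {
        StringBuilder sb = new StringBuilder("Available programs:\n");
        for (Map.Entry<String, String> entry : programs().entrySet()) {
            sb.append(String.format("  %-45s %s\n", entry.getKey(), entry.getValue()));
        }
        sb.append("Run a program by passing its class name to the job submission command.\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Running the jar without naming a class lands here and lists what is available.
        System.out.print(usage());
    }
}
```

Keeping the registry in one place means the usage listing can only drift from the jar contents in a single file.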



[jira] [Commented] (FLINK-4204) Clean up gelly-examples

2016-07-12 Thread Greg Hogan (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374064#comment-15374064 ]

Greg Hogan commented on FLINK-4204:
---

I think there is a strong case for providing both 1) drivers and 2) examples. 
The drivers are a nice way to kick the tires and run the algorithms on actual 
data, and they double as examples of using the library methods. The example 
algorithms, as you note, illustrate the APIs.

The {{provided}} scoping that was discussed in February forced the executable 
code into the separate examples module.

I think it would be helpful to namespace the drivers into 
{{o.a.f.graph.examples.driver}}. It would also help to provide some 
documentation under "Using Gelly" for running a job.

It's nice to consolidate algorithms where possible. For example, 
ClusteringCoefficient computes both the local and global coefficients for 
directed and undirected graphs.

I like seeing three variants of, for example, SSSP, as the comparison makes a 
useful example. I'd prefer to clean these up a little so that the examples 
demonstrate performant code and can run on a large data set out of the box.



[jira] [Commented] (FLINK-4204) Clean up gelly-examples

2016-07-12 Thread Vasia Kalavri (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373767#comment-15373767 ]

Vasia Kalavri commented on FLINK-4204:
--

[~greghogan] let me know what you think!
