[
https://issues.apache.org/jira/browse/FLINK-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254230#comment-15254230
]
Greg Hogan commented on FLINK-2715:
-----------------------------------
The performance of {{TriangleEnumerator}} was considerably worse until the
recent fixes in FLINK-3770. This algorithm could also be updated to initially
order edges by lower degree rather than higher ID. It should also run faster
with the upcoming hashing combiner. The use of {{TreeMap}} likely limits the
performance relative to {{TriangleEnumerator}}.
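To illustrate the degree-ordering idea, here is a minimal sketch in plain Python. It is not Flink or Gelly code and does not reflect how {{TriangleEnumerator}} is implemented; the function name {{count_triangles}} and the edge-list input are assumptions made for this example. Each edge is oriented from its lower-degree endpoint to its higher-degree endpoint, with the vertex ID used only to break ties, so high-degree vertices keep few outgoing edges and generate fewer candidate pairs.

```python
from collections import defaultdict
from itertools import combinations


def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs.

    Hypothetical helper for illustration; vertex IDs must be comparable.
    """
    # Build de-duplicated adjacency sets, ignoring self-loops.
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)

    # Rank vertices by (degree, ID); the ID only breaks ties.
    def rank(v):
        return (len(neighbors[v]), v)

    # Orient each edge from its lower-ranked to its higher-ranked endpoint.
    out = {
        v: {w for w in ns if rank(v) < rank(w)}
        for v, ns in neighbors.items()
    }

    # Each triangle is found exactly once, at its lowest-ranked vertex.
    triangles = 0
    for v, targets in out.items():
        for a, b in combinations(sorted(targets, key=rank), 2):
            if b in out[a]:
                triangles += 1
    return triangles
```

Ordering by ID alone would leave a hub vertex with a low ID holding most of its edges, and the pair generation step grows quadratically in that count.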
Implementing the Global Clustering Coefficient requires the triangle count,
and I've been working on what I think will be a nice way to capture algorithm
metrics without duplicating code.
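As a rough sketch of why the triangle count is needed: under the usual definition, the global clustering coefficient is three times the number of triangles divided by the number of connected triplets, where a vertex of degree d is the center of d(d-1)/2 triplets. The Python below is illustrative only, not Gelly's API; the function name and its parameters are assumptions, and the triangle count is taken as an input computed elsewhere.

```python
from collections import Counter


def global_clustering_coefficient(edges, triangle_count):
    """Return 3 * triangles / triplets for an undirected simple graph.

    Hypothetical helper: `edges` is an iterable of (u, v) pairs and
    `triangle_count` is assumed to have been computed separately.
    """
    degree = Counter()
    seen = set()
    for u, v in edges:
        key = frozenset((u, v))
        # Skip self-loops and duplicate or reversed edges.
        if u == v or key in seen:
            continue
        seen.add(key)
        degree[u] += 1
        degree[v] += 1

    # A vertex of degree d is the center of d * (d - 1) / 2 triplets.
    triplets = sum(d * (d - 1) // 2 for d in degree.values())
    if triplets == 0:
        return 0.0
    return 3 * triangle_count / triplets
```

Because the degree pass and the triangle count are independent, a metric like this can reuse a single triangle-counting result rather than recomputing it.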
The Flink bug has been filed as FLINK-3805.
> Benchmark Triangle Count methods
> --------------------------------
>
> Key: FLINK-2715
> URL: https://issues.apache.org/jira/browse/FLINK-2715
> Project: Flink
> Issue Type: Task
> Components: Gelly
> Affects Versions: 0.10.0
> Reporter: Andra Lungu
> Priority: Minor
> Labels: starter
>
> Once FLINK-2714 is addressed, it would be nice to have a set of benchmarks
> that test the efficiency of the DataSet, GSA and vertex-centric versions.
> This means running the three examples in a cluster environment using various
> graph DataSets, for instance SNAP's Orkut and Friendster networks
> (https://snap.stanford.edu/data/).
> The results produced by the experiments should then be reported in the Gelly
> docs.