GitHub user vasia commented on the pull request:
https://github.com/apache/flink/pull/1105#issuecomment-139958798
Hi,
I agree with @fhueske that we should consider porting the existing DataSet
example to the Gelly library. The two algorithms differ, and it's not
clear (at least to me) which one is more efficient. The DataSet
implementation applies the optimization of grouping on the vertices with
low degrees. The one we currently have in Gelly (and the one in this PR)
applies the optimization that the vertex with the lowest ID is the one that
detects the triangle.
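
To make the difference between the two orderings concrete, here is a minimal single-machine sketch in plain Java. It is not Flink or Gelly code and not the implementation from either example; the class and method names (`TriangleSketch`, `count`, `precedes`) are made up for illustration. It assumes an undirected simple graph with integer vertex IDs, and it reports both the number of triangles and the number of candidate neighbor pairs that had to be checked, since the second number is where the two orderings differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TriangleSketch {

    /**
     * Hypothetical helper: returns {triangles, candidatePairs}.
     * orderByDegree = false: the lowest-ID vertex of a triangle detects it.
     * orderByDegree = true:  the lowest-degree vertex detects it (ties broken by ID).
     */
    public static long[] count(int[][] edges, boolean orderByDegree) {
        // Build an undirected adjacency map; sets drop duplicate edges.
        Map<Integer, Set<Integer>> adj = new HashMap<>();
        for (int[] e : edges) {
            if (e[0] == e[1]) {
                continue; // ignore self-loops
            }
            adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }

        long triangles = 0;
        long candidates = 0;
        for (Map.Entry<Integer, Set<Integer>> entry : adj.entrySet()) {
            int v = entry.getKey();
            // Keep only the neighbors that come after v in the chosen order.
            List<Integer> later = new ArrayList<>();
            for (int n : entry.getValue()) {
                if (precedes(v, n, adj, orderByDegree)) {
                    later.add(n);
                }
            }
            // Every pair of later neighbors is a candidate; it closes a
            // triangle only if the two neighbors are themselves connected.
            for (int i = 0; i < later.size(); i++) {
                for (int j = i + 1; j < later.size(); j++) {
                    candidates++;
                    if (adj.get(later.get(i)).contains(later.get(j))) {
                        triangles++;
                    }
                }
            }
        }
        return new long[] {triangles, candidates};
    }

    /** True if a comes before b in the chosen total order on vertices. */
    public static boolean precedes(int a, int b, Map<Integer, Set<Integer>> adj,
                                   boolean orderByDegree) {
        if (orderByDegree) {
            int da = adj.get(a).size();
            int db = adj.get(b).size();
            if (da != db) {
                return da < db;
            }
        }
        return a < b;
    }
}
```

Both orderings count every triangle exactly once, at its first vertex in the order. They can differ a lot in how many candidate pairs they build, for example when a high-degree vertex happens to have a low ID. How that translates to a distributed setting is exactly what would need to be measured.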
@andralungu, we chose these implementations for your thesis in order
to test a specific method and prove a point. That doesn't necessarily mean
they are the most efficient way to implement triangle counting. Ideally, we
should analyze the complexity of each implementation and run experiments to
determine which one is best to add to the library.
Also, we should definitely support any `Key` type and any value type, and we
should update the existing library methods that assume a particular type for
no good reason.