GitHub user vasia commented on the pull request:
https://github.com/apache/flink/pull/1054#issuecomment-134604410
Thanks for adding this one @andralungu. We really need this algorithm in
the library!
I have a few comments:
- we should clarify what the expected input graph format and the output are.
If I'm not mistaken, you're expecting a directed graph without duplicate
edges, and you count triangles ignoring edge direction. Is that correct? I
would add a clear comment about that in the usage description. To go a step
further, we could also add a graph validator for the input.
- I'm not quite sure what happens when the graph has edges in opposite
directions, i.e. a->b and b->a, that are both part of a triangle. I would expect
this triangle to be counted twice, but it seems to me that you're only
counting it once. Is there a reason for that?
- as you've been experimenting with this for a while, could you let us know
how much better this is than your vertex-centric version? Is it always better? If
not, do you think it would make sense to add both implementations to the
library and let users choose?
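
To make the second point concrete, here is a rough sketch of the two counting semantics I have in mind. It is plain Python rather than Gelly code, it is not the implementation in this PR, and both function names are made up for illustration. The first function ignores direction and collapses a->b / b->a into one undirected edge; the second treats every directed edge as distinct, so a triangle containing an opposite-direction pair is counted once per choice of edge.

```python
from collections import Counter
from itertools import combinations


def count_triangles_undirected(edges):
    """Count triangles, ignoring direction and collapsing duplicate/opposite edges.

    Hypothetical helper for illustration only; not part of Flink or this PR.
    """
    # Normalise each directed edge (u, v) to an unordered pair; drop self-loops.
    undirected = {frozenset((u, v)) for u, v in edges if u != v}

    # Build an undirected adjacency map.
    neighbours = {}
    for pair in undirected:
        u, v = tuple(pair)
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    # Each common neighbour of u and v closes a triangle over the edge {u, v}.
    closed = 0
    for pair in undirected:
        u, v = tuple(pair)
        closed += len(neighbours[u] & neighbours[v])

    # Every triangle is found once per edge, i.e. three times.
    return closed // 3


def count_triangles_with_multiplicity(edges):
    """Count triangles, treating every directed edge as a distinct edge.

    Hypothetical helper for illustration only; not part of Flink or this PR.
    """
    # Number of directed edges between each unordered vertex pair.
    multiplicity = Counter(frozenset((u, v)) for u, v in edges if u != v)
    vertices = list({x for pair in multiplicity for x in pair})

    total = 0
    for a, b, c in combinations(vertices, 3):
        # Missing pairs have multiplicity 0, so non-triangles contribute nothing.
        total += (
            multiplicity[frozenset((a, b))]
            * multiplicity[frozenset((b, c))]
            * multiplicity[frozenset((a, c))]
        )
    return total


# a->b and b->a are both present and both belong to the triangle {a, b, c}.
edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "a")]
print(count_triangles_undirected(edges))
print(count_triangles_with_multiplicity(edges))
```

On this four-edge input the first function reports one triangle and the second reports two. Whichever behaviour the library method is meant to have, it would be good to state it explicitly in the docs.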