[
https://issues.apache.org/jira/browse/MADLIB-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006997#comment-16006997
]
Frank McQuillan commented on MADLIB-1071:
-----------------------------------------
[[email protected]]
could you kindly post the interface you are thinking about for this module?
Other questions that come to mind:
1) Do we know time complexity yet?
2) Are there any traversal algos that we can break out as separate module in
the course of doing this implementation? e.g., BFS?
3) Are there any helper functions that we should provide once we have a
connected components table? I guess these would be separate JIRAs but here are
a cpl examples:
* Set of all nodes which can be reached (have a path) from a specified vertex
* Are 2 specified vertices connected or not
* Count of connected components clusters
* Histogram of connected component sizes
* Get biggest component
* others?
> Graph - weakly connect components
> ---------------------------------
>
> Key: MADLIB-1071
> URL: https://issues.apache.org/jira/browse/MADLIB-1071
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Graph
> Reporter: Frank McQuillan
> Assignee: Rashmi Raghu
> Fix For: v1.12
>
>
> Story
> As a MADlib developer, I want to implement weakly connected components (ref
> [0]) in an efficient and scaleable way.
> Acceptance
> 1) Interface defined
> 2) Design document updated
> 3) Documentation and on-line help
> 4) IC and functional tests
> 5) Scale tests
> References
> [0] https://en.wikipedia.org/wiki/Connectivity_(graph_theory)
> "A directed graph is called weakly connected if replacing all of its directed
> edges with undirected edges produces a connected (undirected) graph."
> [1] Grails paper
> http://pages.cs.wisc.edu/~jignesh/publ/Grail.pdf
> [2] Grails deck
> http://pages.cs.wisc.edu/~jignesh/publ/Grail-slides.pdf
> [3] Grails repo with page rank example
> https://github.com/UWQuickstep/Grail
> https://github.com/UWQuickstep/Grail/blob/master/analytics/wcc.sql
> (weakly connected components implementation)
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)