[ 
https://issues.apache.org/jira/browse/MADLIB-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102439#comment-16102439
 ] 

ASF GitHub Bot commented on MADLIB-1101:
----------------------------------------

GitHub user njayaram2 opened a pull request:

    https://github.com/apache/incubator-madlib/pull/155

    Feature: Weakly connected components helper functions

    JIRA: MADLIB-1101
    
    Add several helper functions that quickly return useful statistics
    about the connected components learned by the
    madlib.weakly_connected_components() function. Five helper functions
    are added as part of this story, along with docs and an updated
    install check. The helper functions are:
    - graph_wcc_largest_cpt(): finds largest components
    - graph_wcc_histogram(): finds number of vertices in each component
    - graph_wcc_vertex_check(): finds all components that have a given
    pair of vertices in them.
    - graph_wcc_num_cpts(): finds total number of components.
    - graph_wcc_reachable_vertices(): finds all vertices reachable
    within a component for a given source vertex.
    
    All these functions are implemented to handle grouping columns too
    if the WCC's output table was created with grouping_cols.
    
    Closes #155
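
To make the five helpers concrete, here is a minimal Python sketch of
what each one is described as computing. It is not the MADlib
implementation, which runs in the database against the WCC output
table. The dict `wcc` (vertex id -> component id) and every function
name below are hypothetical stand-ins chosen for illustration, and
grouping columns are not modelled.

```python
from collections import Counter

# Hypothetical stand-in for a WCC output table: vertex id -> component id.
wcc = {1: 1, 2: 1, 3: 1, 4: 4, 5: 4, 6: 6}


def num_cpts(wcc):
    """Total number of components (cf. graph_wcc_num_cpts)."""
    return len(set(wcc.values()))


def histogram(wcc):
    """Number of vertices in each component (cf. graph_wcc_histogram)."""
    return dict(Counter(wcc.values()))


def largest_cpt(wcc):
    """Largest component(s) and their size (cf. graph_wcc_largest_cpt).

    Ties are all returned, which is an assumption of this sketch.
    """
    hist = histogram(wcc)
    if not hist:
        return {}
    top = max(hist.values())
    return {cid: n for cid, n in hist.items() if n == top}


def vertex_check(wcc, v1, v2):
    """Component id holding both vertices, or None (cf. graph_wcc_vertex_check)."""
    c1, c2 = wcc.get(v1), wcc.get(v2)
    return c1 if c1 is not None and c1 == c2 else None


def reachable_vertices(wcc, src):
    """Other vertices in the same component as src (cf. graph_wcc_reachable_vertices)."""
    if src not in wcc:
        return []
    cid = wcc[src]
    return sorted(v for v, c in wcc.items() if c == cid and v != src)
```

In a weakly connected component every vertex is reachable from every
other when edge direction is ignored, which is why the last helper can
be answered from the component ids alone without revisiting the edges.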

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/njayaram2/incubator-madlib features/wcc_helper

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-madlib/pull/155.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #155
    
----
commit 85e89ef1857ed432f295991a6037aa5732714911
Author: Nandish Jayaram <[email protected]>
Date:   2017-07-18T16:31:09Z

    Feature: Weakly connected components helper functions
    
    JIRA: MADLIB-1101
    
    Add several helper functions that quickly return useful statistics
    about the connected components learned by the
    madlib.weakly_connected_components() function. Five helper functions
    are added as part of this story, along with docs and an updated
    install check. The helper functions are:
    - graph_wcc_largest_cpt(): finds largest components
    - graph_wcc_histogram(): finds number of vertices in each component
    - graph_wcc_vertex_check(): finds all components that have a given
    pair of vertices in them.
    - graph_wcc_num_cpts(): finds total number of components.
    - graph_wcc_reachable_vertices(): finds all vertices reachable
    within a component for a given source vertex.
    
    All these functions are implemented to handle grouping columns too
    if the WCC's output table was created with grouping_cols.
    
    Closes #155

----


> Graph - weakly connected components helper functions
> ----------------------------------------------------
>
>                 Key: MADLIB-1101
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1101
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Graph
>            Reporter: Frank McQuillan
>             Fix For: v1.12
>
>
> Context 
> Follow on from 
> https://issues.apache.org/jira/browse/MADLIB-1071
> Story
> As a data scientist, I want to use helper functions related to weakly 
> connected components, so that I don't have to query the result table myself 
> which is less efficient and subject to error.
> List of helper functions roughly in priority order:
> 1) biggest connected component
> 2) number of nodes per connected component (histogram)
> 3) whether two nodes belong to same or different connected components
> 4) count of connected cpt clusters
> 5) Set of all nodes which can be reached (have a path) from a specified vertex



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
