[ 
https://issues.apache.org/jira/browse/MADLIB-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank McQuillan updated MADLIB-1121:
------------------------------------
    Description: 
Follow on from  https://issues.apache.org/jira/browse/MADLIB-1072. Given that 
this story is complete, what measures can we compute from APSP?

Betweenness may use APSP result set, or implement as a separate algorithm, 
e.g., [1].

References

[1] A Faster Algorithm for Betweenness Centrality
http://www.algo.uni-konstanz.de/publications/b-fabc-01.pdf
(used in Gephi)

[2] Network properties
https://en.wikipedia.org/wiki/Network_science#Network_properties

Other comments

This (betweenness) might require a minor addition to the APSP code, length (the 
number of vertices) of the shortest path, that can be done in the same story.
Here is my idea for betweenness. Let's say APSP shows that the shortest path 
from 1 to 5 is 1,2,3,4,5. In this case the parent will be 4. This means we 
increment the betweenness count for 4 by 1 and set a multiplier value for the 
1->4 shortest path since every vertex in that path has to be counted one more 
time than usual. When we look at the path 1->4 and see the parent is 3, we 
increase its betweenness score by 2 and increase the multiplier for 1->3 path 
to 2 since those vertices should be counted for both of the previous iterations 
(1->4 and 1->5).
The algorithm should start from the path with the highest length and collect 
these values in max(|P|) iterations where |P| is the length of a given path. I 
am pretty sure that the reverse order is doable as well.
This implementation would be problematic on the HAWQ environment since it 
relied heavily on updates, so a completely independent algorithm might prove 
useful.


  was:
Follow on from  https://issues.apache.org/jira/browse/MADLIB-1072. Given that 
this story is complete, what measures can we compute from APSP?

Betweenness may use APSP result set, or implement as a separate algorithm, 
e.g., [1].

References

[1] A Faster Algorithm for Betweenness Centrality
http://www.algo.uni-konstanz.de/publications/b-fabc-01.pdf
(used in Gephi)

[2] Network properties
https://en.wikipedia.org/wiki/Network_science#Network_properties



> Graph - Betweenness centrality
> ------------------------------
>
>                 Key: MADLIB-1121
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1121
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Graph
>            Reporter: Frank McQuillan
>             Fix For: v1.12
>
>
> Follow on from  https://issues.apache.org/jira/browse/MADLIB-1072. Given that 
> this story is complete, what measures can we compute from APSP?
> Betweenness may use APSP result set, or implement as a separate algorithm, 
> e.g., [1].
> References
> [1] A Faster Algorithm for Betweenness Centrality
> http://www.algo.uni-konstanz.de/publications/b-fabc-01.pdf
> (used in Gephi)
> [2] Network properties
> https://en.wikipedia.org/wiki/Network_science#Network_properties
> Other comments
> This (betweenness) might require a minor addition to the APSP code, length 
> (the number of vertices) of the shortest path, that can be done in the same 
> story.
> Here is my idea for betweenness. Let's say APSP shows that the shortest path 
> from 1 to 5 is 1,2,3,4,5. In this case the parent will be 4. This means we 
> increment the betweenness count for 4 by 1 and set a multiplier value for the 
> 1->4 shortest path since every vertex in that path has to be counted one more 
> time than usual. When we look at the path 1->4 and see the parent is 3, we 
> increase its betweenness score by 2 and increase the multiplier for 1->3 path 
> to 2 since those vertices should be counted for both of the previous 
> iterations (1->4 and 1->5).
> The algorithm should start from the path with the highest length and collect 
> these values in max(|P|) iterations where |P| is the length of a given path. 
> I am pretty sure that the reverse order is doable as well.
> This implementation would be problematic on the HAWQ environment since it 
> relied heavily on updates, so a completely independent algorithm might prove 
> useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to