[ 
https://issues.apache.org/jira/browse/MADLIB-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166695#comment-16166695
 ] 

Jingyi Mei commented on MADLIB-1124:
------------------------------------

Question: If the graph is a directed multigraph, it may contain duplicate 
edges. How are we going to deal with those duplicate edges for HITS? 

The current implementation doesn’t check whether an edge is distinct, which 
means that if there are multiple edges from one vertex to another, each of 
those edges is counted. For example, suppose paper A contains multiple links 
that refer to paper B. With our current calculation, A’s hub score is added to 
B’s authority score once per link. In this case, A plays a 'more important' 
role than other vertices that have only one edge pointing to B. Does this make 
sense? Or should we treat every vertex equally and count only distinct edges 
between vertices?
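
To make the difference concrete, here is a minimal Python sketch of one 
un-normalized authority update. It is an illustration only, not the MADlib 
implementation: the function name authority_scores, the distinct flag, the 
extra vertex C, and the hub scores of 1.0 are all assumptions made up for this 
example.

```python
from collections import defaultdict


def authority_scores(edges, hub, distinct=False):
    """One un-normalized HITS authority update (hypothetical helper).

    auth[v] is the sum of hub[u] over every edge (u, v).
    With distinct=True, parallel edges are collapsed first.
    """
    if distinct:
        edges = set(edges)  # keep one copy of each (src, dest) pair
    auth = defaultdict(float)
    for src, dest in edges:
        auth[dest] += hub[src]
    return dict(auth)


# Paper A links to paper B three times; paper C links to B once.
edges = [("A", "B"), ("A", "B"), ("A", "B"), ("C", "B")]
hub = {"A": 1.0, "B": 1.0, "C": 1.0}  # assumed starting hub scores

counted = authority_scores(edges, hub)                 # every edge counted
deduped = authority_scores(edges, hub, distinct=True)  # distinct edges only
```

With every edge counted, A contributes three quarters of B’s authority score 
in this toy graph. With distinct edges, A and C contribute equally.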

> Graph - HITS algorithm
> ----------------------
>
>                 Key: MADLIB-1124
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1124
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Graph
>            Reporter: Frank McQuillan
>            Assignee: Jingyi Mei
>             Fix For: v2.0
>
>         Attachments: pagerank_hits.png
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
