[ 
https://issues.apache.org/jira/browse/MADLIB-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147605#comment-16147605
 ] 

Jingyi Mei commented on MADLIB-1124:
------------------------------------

To answer questions from [~fmcquillan]:
1. Threshold. Yes, it should be optional. Because we normalize both the 
authority and hub scores, the valid threshold range is \[0, 1\]. Currently, there 
is no strong evidence that we should use different thresholds for the authority 
and hub scores, so we decided to use (1/number of vertices * 1000) as the default 
threshold. Here is the new description for threshold:
{code}
FLOAT8, default: (1/number of vertices * 1000). If the difference between the 
values of both scores (Authority and Hub) for every vertex of two consecutive 
iterations is smaller than 'threshold', or the iteration number is larger than 
'max_iter', the computation stops. If you set the threshold to zero, then you 
will force the algorithm to run for the full number of iterations specified in 
'max_iter'. Threshold needs to be set to a value less than or equal to 1 since 
both values (Authority and Hub) of nodes are initialized as 1. Note that both 
the Authority and Hub value differences must be below threshold for the 
algorithm to stop. 
{code}
2. HITS doesn’t assign different 'weight' or 'importance' to different nodes, 
so it shouldn’t rely on eigenvector centrality.
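
To make the stopping rule in item 1 concrete, here is a minimal Python sketch. It is not the MADlib implementation: the function name hits, the edge-list input, the L2 normalization, and reading the default as 1 / (number of vertices * 1000) are all assumptions made for illustration.

```python
import math


def hits(edges, max_iter=100, threshold=None):
    """Illustrative HITS loop with the stopping rule from the description.

    edges: list of (src, dest) pairs (hypothetical input format).
    Returns (authority, hub, iterations_run).
    """
    vertices = sorted({v for edge in edges for v in edge})
    if not vertices:
        raise ValueError("graph has no vertices")
    if threshold is None:
        # Assumption: the default is read as 1 / (n * 1000).
        threshold = 1.0 / (len(vertices) * 1000)
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")

    # Both scores are initialized as 1 for every vertex.
    auth = {v: 1.0 for v in vertices}
    hub = {v: 1.0 for v in vertices}

    def normalize(scores):
        # Assumption: L2 normalization, which keeps every score in [0, 1].
        norm = math.sqrt(sum(s * s for s in scores.values()))
        return {v: (s / norm if norm else 0.0) for v, s in scores.items()}

    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Authority of a vertex: sum of hub scores of vertices pointing to it.
        new_auth = {v: 0.0 for v in vertices}
        for src, dest in edges:
            new_auth[dest] += hub[src]
        new_auth = normalize(new_auth)

        # Hub of a vertex: sum of authority scores of vertices it points to.
        new_hub = {v: 0.0 for v in vertices}
        for src, dest in edges:
            new_hub[src] += new_auth[dest]
        new_hub = normalize(new_hub)

        # Largest per-vertex change between two consecutive iterations.
        auth_diff = max(abs(new_auth[v] - auth[v]) for v in vertices)
        hub_diff = max(abs(new_hub[v] - hub[v]) for v in vertices)
        auth, hub = new_auth, new_hub

        # Stop only when BOTH differences are below the threshold.
        # With threshold = 0 this is never true, so max_iter iterations run.
        if auth_diff < threshold and hub_diff < threshold:
            break
    return auth, hub, iterations
```

For example, hits([(0, 1), (0, 2), (1, 2), (2, 0)], max_iter=5, threshold=0.0) runs all 5 iterations, because a strict "smaller than" comparison against zero can never succeed.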

> Graph - HITS algorithm
> ----------------------
>
>                 Key: MADLIB-1124
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1124
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Graph
>            Reporter: Frank McQuillan
>            Assignee: Jingyi Mei
>             Fix For: v2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
