[ https://issues.apache.org/jira/browse/MADLIB-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147605#comment-16147605 ]
Jingyi Mei commented on MADLIB-1124:
------------------------------------
To answer questions from [~fmcquillan]:
1. Threshold. Yes, it should be optional. Because we normalize both the
authority and hub scores, the valid threshold range is [0, 1]. Currently, there
is no strong evidence that we should use different thresholds for the authority
and hub scores, so we decided to pick (1/number of vertices * 1000) as the
default threshold. Here is the new description for threshold:
{code}
FLOAT8, default: (1/number of vertices * 1000). If the difference between the
values of both scores (Authority and Hub) for every vertex of two consecutive
iterations is smaller than 'threshold', or the iteration number is larger than
'max_iter', the computation stops. If you set the threshold to zero, then you
will force the algorithm to run for the full number of iterations specified in
'max_iter'. The threshold needs to be set to a value less than or equal to 1,
since both values (Authority and Hub) of nodes are initialized as 1. Note that
both the Authority and Hub value differences must be below the threshold for
the algorithm to stop.
{code}
2. HITS doesn’t assign different 'weight' or 'importance' to different nodes,
so it shouldn’t rely on eigenvector centrality.
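
For illustration, the stopping rule described in point 1 might look like the following Python sketch. This is not the MADlib implementation: the function names `hits` and `default_threshold`, the edge-list input format, and the L2 normalization are hypothetical, and the default is read here as 1 / (number of vertices * 1000), which is an assumption about the intended grouping.

```python
import math


def default_threshold(n_vertices):
    # Assumed reading of "(1/number of vertices * 1000)": 1 / (n * 1000).
    return 1.0 / (n_vertices * 1000)


def _normalize(scores):
    # L2 normalization (an assumption; the comment only says scores are normalized).
    norm = math.sqrt(sum(s * s for s in scores))
    return scores if norm == 0 else [s / norm for s in scores]


def hits(edges, n_vertices, max_iter=100, threshold=None):
    """Hypothetical HITS loop. edges: (src, dest) pairs with ids in range(n_vertices)."""
    if threshold is None:
        threshold = default_threshold(n_vertices)
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")

    # Both scores start at 1 for every vertex, as stated in the description.
    auth = [1.0] * n_vertices
    hub = [1.0] * n_vertices
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Authority update: sum of hub scores of vertices pointing at each vertex.
        new_auth = [0.0] * n_vertices
        for src, dest in edges:
            new_auth[dest] += hub[src]
        new_auth = _normalize(new_auth)

        # Hub update: sum of authority scores of vertices each vertex points to.
        new_hub = [0.0] * n_vertices
        for src, dest in edges:
            new_hub[src] += new_auth[dest]
        new_hub = _normalize(new_hub)

        # Largest per-vertex change between two consecutive iterations.
        auth_diff = max(abs(a - b) for a, b in zip(new_auth, auth))
        hub_diff = max(abs(a - b) for a, b in zip(new_hub, hub))
        auth, hub = new_auth, new_hub

        # Stop only when BOTH differences are below the threshold.
        # With threshold == 0 this is never true, so all max_iter iterations run.
        if auth_diff < threshold and hub_diff < threshold:
            break
    return auth, hub, iteration
```

Calling `hits([(0, 1), (0, 2), (1, 2)], 3, max_iter=5, threshold=0.0)` in this sketch runs all 5 iterations, matching the "threshold of zero" behavior in the description, while leaving `threshold` unset lets the loop stop early once both scores have settled.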
> Graph - HITS algorithm
> ----------------------
>
> Key: MADLIB-1124
> URL: https://issues.apache.org/jira/browse/MADLIB-1124
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Graph
> Reporter: Frank McQuillan
> Assignee: Jingyi Mei
> Fix For: v2.0
>
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)