[
https://issues.apache.org/jira/browse/SPARK-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502311#comment-14502311
]
Sean Owen commented on SPARK-7005:
----------------------------------
It's actually mentioned on the same Wikipedia page:
http://en.wikipedia.org/wiki/PageRank#Damping_factor
Why is the result different, other than being scaled by N? Each vertex starts
with PR 1, not 1/N, which you can also see in the scaladoc.
Maybe it's more conventional to have the ranks sum to 1, I don't know, but it's
not really a bug; you can easily divide through by N if you need to.
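To illustrate the point, here is a minimal sketch (plain Python power iteration on a toy 3-node graph, not the GraphX implementation) showing that the two conventions differ only by the factor N: starting every vertex at 1 with a reset term of alpha gives exactly N times the ranks you get starting at 1/N with a reset term of alpha/N.

```python
# Sketch: PageRank power iteration under two conventions.
# "start" is the initial rank per vertex, "reset" is the constant reset term.
def pagerank(out_edges, n, alpha, iters, start, reset):
    pr = {v: start for v in range(n)}
    for _ in range(iters):
        incoming = {v: 0.0 for v in range(n)}
        for src, dsts in out_edges.items():
            share = pr[src] / len(dsts)  # oldPR[j] / outDeg[j]
            for d in dsts:
                incoming[d] += share
        pr = {v: reset + (1 - alpha) * incoming[v] for v in range(n)}
    return pr

# Toy graph: 0 -> {1, 2}, 1 -> {2}, 2 -> {0} (no dangling vertices)
edges = {0: [1, 2], 1: [2], 2: [0]}
n, alpha = 3, 0.15

# GraphX convention: PR starts at 1, reset term is alpha
graphx_pr = pagerank(edges, n, alpha, 50, start=1.0, reset=alpha)
# Wikipedia convention: PR starts at 1/N, reset term is alpha/N
wiki_pr = pagerank(edges, n, alpha, 50, start=1.0 / n, reset=alpha / n)

# The two agree up to the scale factor N at every vertex.
for v in range(n):
    assert abs(graphx_pr[v] - n * wiki_pr[v]) < 1e-9
```

The scaling follows from linearity of the update: if p = N*q, then alpha + (1-alpha)*A*p = N*(alpha/N + (1-alpha)*A*q), so the factor N is preserved at every iteration.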
> resetProb error in pagerank
> ---------------------------
>
> Key: SPARK-7005
> URL: https://issues.apache.org/jira/browse/SPARK-7005
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.3.0
> Reporter: lisendong
> Labels: easyfix
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> In the PageRank code, the resetProb should be divided by the number of
> vertices, according to Wikipedia:
> http://en.wikipedia.org/wiki/PageRank
> that is:
> PR[i] = alpha / N + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
> but the code is (org.apache.spark.graphx.lib.PageRank)
> PR[i] = alpha + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]