[ 
https://issues.apache.org/jira/browse/SPARK-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502311#comment-14502311
 ] 

Sean Owen commented on SPARK-7005:
----------------------------------

It's actually mentioned on the same Wikipedia page: 
http://en.wikipedia.org/wiki/PageRank#Damping_factor
Why is the result different, other than being scaled by N? Each vertex starts 
with PR 1, not 1/N, which you can also see in the Scaladoc.

Maybe it's more conventional to have the ranks sum to 1, I don't know, but it's 
not really a bug; you can easily divide through by N if you need to.
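To make the "scaled by N" point concrete, here is a minimal sketch (not from the issue; the 3-node graph and edge list are made up for illustration) that iterates both conventions side by side: the Wikipedia form with reset term alpha/N and start value 1/N, and the GraphX-style form with reset term alpha and start value 1. Because the update is linear, the GraphX ranks are exactly N times the Wikipedia ranks at every step:

```python
# Hypothetical 3-node directed graph, chosen so every vertex has out-degree >= 1.
alpha = 0.15                                # resetProb
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]    # assumed edges, for illustration only
N = 3

out_deg = [0] * N
for src, _ in edges:
    out_deg[src] += 1
in_nbrs = [[s for s, d in edges if d == i] for i in range(N)]

def iterate(pr, reset):
    # PR[i] = reset + (1 - alpha) * sum over in-neighbors j of PR[j] / outDeg[j]
    return [reset + (1 - alpha) * sum(pr[j] / out_deg[j] for j in in_nbrs[i])
            for i in range(N)]

std = [1.0 / N] * N   # Wikipedia convention: start at 1/N, reset term alpha/N
gx = [1.0] * N        # GraphX-style convention: start at 1, reset term alpha
for _ in range(50):
    std = iterate(std, alpha / N)
    gx = iterate(gx, alpha)

# The GraphX-style ranks are the standard ranks scaled by N...
assert all(abs(g - s * N) < 1e-9 for g, s in zip(gx, std))
# ...and the standard ranks keep summing to 1 (no dangling nodes here).
assert abs(sum(std) - 1.0) < 1e-9
```

So dividing the GraphX-style result through by N recovers the probability-normalized values; the relative ordering of vertices is identical either way.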

> resetProb error in pagerank
> ---------------------------
>
>                 Key: SPARK-7005
>                 URL: https://issues.apache.org/jira/browse/SPARK-7005
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 1.3.0
>            Reporter: lisendong
>              Labels: easyfix
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In the PageRank code, resetProb should be divided by the number of vertices 
> (N), according to Wikipedia:
> http://en.wikipedia.org/wiki/PageRank
> that is: 
> PR[i] = alpha / N + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
> but the code (org.apache.spark.graphx.lib.PageRank) computes:
> PR[i] = alpha + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
