[ 
https://issues.apache.org/jira/browse/SPARK-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Dave resolved SPARK-3206.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.2.0
         Assignee: Ankur Dave

> Error in PageRank values
> ------------------------
>
>                 Key: SPARK-3206
>                 URL: https://issues.apache.org/jira/browse/SPARK-3206
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.0.2
>         Environment: UNIX with Hadoop
>            Reporter: Peter Fontana
>            Assignee: Ankur Dave
>             Fix For: 1.2.0
>
>
> I have found a small example where the PageRank values computed by run and 
> runUntilConvergence differ substantially.
> I am running the PageRank module on the following graph:
> Edge Table:
> || Node1 || Node2 ||
> | 1 | 2 |
> | 1 | 3 |
> | 3 | 2 |
> | 3 | 4 |
> | 5 | 3 |
> | 6 | 7 |
> | 7 | 8 |
> | 8 | 9 |
> | 9 | 7 |
> Node Table (note the extra node 10, which appears in no edge):
> || NodeName || NodeID ||
> | a | 1 |
> | b | 2 |
> | c | 3 |
> | d | 4 |
> | e | 5 |
> | f | 6 |
> | g | 7 |
> | h | 8 |
> | i | 9 |
> | j.longaddress.com | 10 |
> with a default resetProb of 0.15.
> When I compute PageRank with runUntilConvergence, running 
> {{val ranks = PageRank.runUntilConvergence(graph, 0.0001).vertices}}
> I get the ranks
> (4,0.29503124999999997)
> (1,0.15)
> (6,0.15)
> (3,0.34124999999999994)
> (7,1.3299054047985106)
> (9,1.2381240056453071)
> (8,1.2803346052504254)
> (10,0.15)
> (5,0.15)
> (2,0.35878124999999994)
> However, when I run PageRank with the run() method, running 
> {{val ranksI = PageRank.run(graph, 100).vertices}}
> I get the ranks
> (4,0.29503124999999997)
> (1,0.15)
> (6,0.15)
> (3,0.34124999999999994)
> (7,0.9999999387662847)
> (9,0.9999999256447741)
> (8,0.9999999256447741)
> (10,0.15)
> (5,0.15)
> (2,0.29503124999999997)
> These are quite different, leading me to suspect that one of the PageRank 
> methods is incorrect. I have examined the source, but I do not know what the 
> correct fix is, or which set of values is correct.
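
One way to judge which set of values is plausible is to iterate the update rule by hand, outside Spark. The sketch below is plain Scala with no GraphX dependency. It assumes the unnormalized rule PR[i] = resetProb + (1 - resetProb) * sum over in-neighbors j of PR[j] / outDeg[j], which is how I understand the GraphX PageRank documentation; the object name PageRankCheck and the method ranks are hypothetical names used only for this illustration, not part of GraphX.

```scala
object PageRankCheck {
  // Edge list copied from the report (source -> destination).
  val edges: Seq[(Long, Long)] = Seq(
    (1L, 2L), (1L, 3L), (3L, 2L), (3L, 4L), (5L, 3L),
    (6L, 7L), (7L, 8L), (8L, 9L), (9L, 7L)
  )

  // Vertex ids 1..10; vertex 10 appears in no edge.
  val vertices: Seq[Long] = 1L to 10L

  // Number of outgoing edges per source vertex.
  val outDegree: Map[Long, Int] =
    edges.groupBy(_._1).map { case (src, es) => src -> es.size }

  /** Synchronous iteration of the assumed rule:
    * PR[i] = resetProb + (1 - resetProb) * sum_j PR[j] / outDeg[j].
    * This is an independent check, not the GraphX implementation.
    */
  def ranks(iterations: Int, resetProb: Double = 0.15): Map[Long, Double] = {
    var pr: Map[Long, Double] = vertices.map(v => v -> resetProb).toMap
    for (_ <- 1 to iterations) {
      // Rank mass arriving at each destination from its in-neighbors.
      val incoming: Map[Long, Double] = edges
        .groupBy(_._2)
        .map { case (dst, es) =>
          dst -> es.map { case (src, _) => pr(src) / outDegree(src) }.sum
        }
      pr = vertices.map { v =>
        v -> (resetProb + (1 - resetProb) * incoming.getOrElse(v, 0.0))
      }.toMap
    }
    pr
  }

  def main(args: Array[String]): Unit =
    ranks(100).toSeq.sortBy(_._1).foreach { case (v, r) =>
      println(f"$v%2d  $r%.6f")
    }
}
```

Under that assumption, vertex 2 receives 0.15 + 0.85 * (0.15/2 + 0.34125/2) = 0.35878125, and the 7 -> 8 -> 9 -> 7 cycle settles near 1.3304, 1.2809 and 1.2387. Those figures agree with the runUntilConvergence output to within roughly the requested tolerance, whereas the run() values for vertices 2, 7, 8 and 9 do not satisfy the rule.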



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
