Peter Fontana created SPARK-3206:
------------------------------------

             Summary: Error in PageRank values
                 Key: SPARK-3206
                 URL: https://issues.apache.org/jira/browse/SPARK-3206
             Project: Spark
          Issue Type: Bug
          Components: GraphX
    Affects Versions: 1.0.2
         Environment: UNIX with Hadoop
            Reporter: Peter Fontana


I have found a small example where the PageRank values computed by run and
runUntilConvergence differ substantially.

I am running the PageRank module on the following graph:

Edge Table:

| Node1 | Node2 |
| ----- | ----- |
| 1     | 2     |
| 1     | 3     |
| 3     | 2     |
| 3     | 4     |
| 5     | 3     |
| 6     | 7     |
| 7     | 8     |
| 8     | 9     |
| 9     | 7     |

Node Table (note the extra node):

| NodeName          | NodeID |
| ----------------- | ------ |
| a                 | 1      |
| b                 | 2      |
| c                 | 3      |
| d                 | 4      |
| e                 | 5      |
| f                 | 6      |
| g                 | 7      |
| h                 | 8      |
| i                 | 9      |
| j.longaddress.com | 10     |

with the default resetProb of 0.15.

When I compute PageRank with runUntilConvergence, running

    val ranks = PageRank.runUntilConvergence(graph, 0.0001).vertices

I get the ranks:
(4,0.29503124999999997)
(1,0.15)
(6,0.15)
(3,0.34124999999999994)
(7,1.3299054047985106)
(9,1.2381240056453071)
(8,1.2803346052504254)
(10,0.15)
(5,0.15)
(2,0.35878124999999994)

However, when I compute PageRank with the run() method, running

    val ranksI = PageRank.run(graph, 100).vertices

I get the ranks:

(4,0.29503124999999997)
(1,0.15)
(6,0.15)
(3,0.34124999999999994)
(7,0.9999999387662847)
(9,0.9999999256447741)
(8,0.9999999256447741)
(10,0.15)
(5,0.15)
(2,0.29503124999999997)
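
For reference, a minimal spark-shell sketch of the setup described above. It
assumes a live SparkContext named sc (as spark-shell provides), that the
numeric column of the node table is the vertex id, and an arbitrary edge
attribute of 1. None of these details are stated in the report, so this is an
illustration rather than the original script.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.graphx.lib.PageRank

// Assumption: `sc` is the SparkContext that spark-shell creates.
// Vertex ids and names follow the node table; vertex 10 has no edges.
val vertices = sc.parallelize(Seq(
  (1L, "a"), (2L, "b"), (3L, "c"), (4L, "d"), (5L, "e"),
  (6L, "f"), (7L, "g"), (8L, "h"), (9L, "i"),
  (10L, "j.longaddress.com")))

// Edges follow the edge table; the attribute value 1 is a placeholder.
val edges = sc.parallelize(Seq(
  Edge(1L, 2L, 1), Edge(1L, 3L, 1), Edge(3L, 2L, 1),
  Edge(3L, 4L, 1), Edge(5L, 3L, 1), Edge(6L, 7L, 1),
  Edge(7L, 8L, 1), Edge(8L, 9L, 1), Edge(9L, 7L, 1)))

val graph = Graph(vertices, edges)

// Both calls use the default resetProb of 0.15.
val ranks  = PageRank.runUntilConvergence(graph, 0.0001).vertices
val ranksI = PageRank.run(graph, 100).vertices

// Print both result sets sorted by vertex id for a side-by-side comparison.
ranks.collect().sortBy(_._1).foreach(println)
ranksI.collect().sortBy(_._1).foreach(println)
```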

The two results disagree on vertices 2, 7, 8, and 9, which leads me to suspect
that one of the PageRank methods is incorrect. I have examined the source, but
I do not know which set of values is correct or what the correct fix would be.
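
One way to cross-check the two result sets without Spark is to iterate the
update rule directly in plain Scala. The sketch below assumes the rule
PR(v) = resetProb + (1 - resetProb) * sum of PR(u) / outDeg(u) over in-edges
(u -> v), with every rank starting at 1.0. The object and method names are
hypothetical, and this is not the GraphX implementation.

```scala
object PageRankCheck {
  // Edge list copied from the edge table in this report.
  val edges: Seq[(Long, Long)] = Seq(
    (1L, 2L), (1L, 3L), (3L, 2L), (3L, 4L), (5L, 3L),
    (6L, 7L), (7L, 8L), (8L, 9L), (9L, 7L))

  // Vertex ids 1 to 10; vertex 10 has no edges.
  val vertexIds: Seq[Long] = (1 to 10).map(_.toLong)

  /** Applies the assumed update rule numIter times, starting from 1.0. */
  def iterate(numIter: Int, resetProb: Double = 0.15): Map[Long, Double] = {
    // Out-degree of every vertex that has at least one outgoing edge.
    val outDeg: Map[Long, Int] =
      edges.groupBy(_._1).map { case (src, es) => src -> es.size }

    var ranks: Map[Long, Double] = vertexIds.map(v => v -> 1.0).toMap
    for (_ <- 1 to numIter) {
      val prev = ranks
      ranks = vertexIds.map { v =>
        // Sum the contributions of all in-neighbours from the previous step.
        val incoming = edges.collect {
          case (src, dst) if dst == v => prev(src) / outDeg(src)
        }.sum
        v -> (resetProb + (1.0 - resetProb) * incoming)
      }.toMap
    }
    ranks
  }

  def main(args: Array[String]): Unit =
    iterate(100).toSeq.sortBy(_._1).foreach(println)
}
```

Under that assumed rule the iteration settles at 0.35878125 for vertex 2 and
at roughly 1.330, 1.281, and 1.239 for vertices 7, 8, and 9, which is closer
to the runUntilConvergence output than to the run output. This only checks
consistency with the assumed rule; it does not show what run is intended to
compute.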



--
This message was sent by Atlassian JIRA
(v6.2#6252)
