Github user luyi0619 commented on the pull request:
https://github.com/apache/spark/pull/1128#issuecomment-47226594
Hi ankurdave,
For the graph you gave (1-->2), I think your PageRank implementation will
give the same result with either initial value.
When the initial value is 0.0:
a b
0 0
0.15 0.15
0.15 0.2775
When the initial value is 1.0:
a b
1 1
0.15 1
0.15 0.2775
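The two tables above can be reproduced with a minimal Python sketch. It assumes the synchronous update PR(v) = 0.15 + 0.85 * sum(PR(u) / outdeg(u)) over in-neighbours u, which is consistent with the numbers shown. The function and variable names are illustrative; this is not the GraphX code itself.

```python
def pagerank_steps(edges, vertices, initial_value, num_iters, reset_prob=0.15):
    """Return the rank of every vertex after each synchronous iteration."""
    out_degree = {v: 0 for v in vertices}
    for src, _ in edges:
        out_degree[src] += 1

    ranks = {v: initial_value for v in vertices}
    history = [dict(ranks)]
    for _ in range(num_iters):
        # Each vertex splits its current rank evenly over its out-edges.
        incoming = {v: 0.0 for v in vertices}
        for src, dst in edges:
            incoming[dst] += ranks[src] / out_degree[src]
        ranks = {v: reset_prob + (1.0 - reset_prob) * incoming[v] for v in vertices}
        history.append(dict(ranks))
    return history


# The graph from the discussion: a single edge a --> b (1-->2).
edges = [("a", "b")]
vertices = ["a", "b"]

from_zero = pagerank_steps(edges, vertices, 0.0, 2)
from_one = pagerank_steps(edges, vertices, 1.0, 2)

for label, history in (("initial 0.0", from_zero), ("initial 1.0", from_one)):
    print(label)
    for row in history:
        print(round(row["a"], 6), round(row["b"], 6))
```

Both runs end at a = 0.15, b = 0.2775 after two iterations, because on this graph the initial value only affects the intermediate rows.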
Actually, there is another problem we should probably handle: 0-outdegree
vertices in the graph (e.g. vertex 2). One solution is to distribute their
PageRank values uniformly to all the vertices, but then the initial value
should be 1.0.
For example,
a b
1 1
0.575 1.425
0.755625 1.24438
0.678859 1.32114
0.711485 1.28852
... (some iterations omitted) ...
0.701756 1.29824
0.701754 1.29825
0.701755 1.29825
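A minimal Python sketch of this redistribution reproduces the rows above. It assumes the update PR(v) = 0.15 + 0.85 * (sum(PR(u) / outdeg(u)) + D / N), where D is the total rank held by 0-outdegree vertices and N is the number of vertices. The function name and the iteration count of 50 are illustrative choices, not part of any library.

```python
def pagerank_with_dangling(edges, vertices, initial_value, num_iters, reset_prob=0.15):
    """PageRank where 0-outdegree vertices spread their rank over all vertices."""
    n = len(vertices)
    out_degree = {v: 0 for v in vertices}
    for src, _ in edges:
        out_degree[src] += 1

    ranks = {v: initial_value for v in vertices}
    history = [dict(ranks)]
    for _ in range(num_iters):
        # Rank held by vertices with no out-edges, shared uniformly by everyone.
        dangling_mass = sum(ranks[v] for v in vertices if out_degree[v] == 0)
        incoming = {v: dangling_mass / n for v in vertices}
        for src, dst in edges:
            incoming[dst] += ranks[src] / out_degree[src]
        ranks = {v: reset_prob + (1.0 - reset_prob) * incoming[v] for v in vertices}
        history.append(dict(ranks))
    return history


# Same graph as before: a --> b, so b is the 0-outdegree vertex.
history = pagerank_with_dangling([("a", "b")], ["a", "b"], 1.0, 50)

for row in history[:5]:
    print(round(row["a"], 6), round(row["b"], 6))
print("final:", round(history[-1]["a"], 6), round(history[-1]["b"], 6))
```

With an initial value of 1.0 the two ranks sum to 2 at every iteration, and they settle near a = 0.701754, b = 1.298246.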
In my understanding, if there are no 0-outdegree vertices in a graph, the
sum of the PageRank values over all vertices should remain the same from one
iteration to the next. However, in the GraphX implementation, this is not true.
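As a rough illustration of the sum argument, the Python sketch below tracks the total rank per iteration on a hypothetical two-vertex cycle (a --> b, b --> a), which has no 0-outdegree vertices. It assumes the same update rule as the tables in this message, PR(v) = 0.15 + 0.85 * sum(PR(u) / outdeg(u)); it is not the GraphX code, and the names are illustrative.

```python
def rank_sums(edges, vertices, initial_value, num_iters, reset_prob=0.15):
    """Return the total rank over all vertices after each iteration."""
    out_degree = {v: 0 for v in vertices}
    for src, _ in edges:
        out_degree[src] += 1

    ranks = {v: initial_value for v in vertices}
    sums = [sum(ranks.values())]
    for _ in range(num_iters):
        incoming = {v: 0.0 for v in vertices}
        for src, dst in edges:
            incoming[dst] += ranks[src] / out_degree[src]
        ranks = {v: reset_prob + (1.0 - reset_prob) * incoming[v] for v in vertices}
        sums.append(sum(ranks.values()))
    return sums


# Hypothetical example graph: every vertex has exactly one out-edge.
cycle_edges = [("a", "b"), ("b", "a")]
cycle_vertices = ["a", "b"]

sums_from_one = rank_sums(cycle_edges, cycle_vertices, 1.0, 3)
sums_from_zero = rank_sums(cycle_edges, cycle_vertices, 0.0, 3)

print("initial 1.0:", [round(s, 6) for s in sums_from_one])
print("initial 0.0:", [round(s, 6) for s in sums_from_zero])
```

Under this update rule the total obeys S' = 0.15 * N + 0.85 * S, so it stays constant only when it starts at N, which is what an initial value of 1.0 gives. Starting from 0.0, the total grows toward N instead of staying fixed.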
In short, I still think the initial value should be 1.0. You could also
refer to GraphLab's implementation; see line 52 of
https://github.com/graphlab-code/graphlab/blob/master/toolkits/graph_analytics/pagerank.cpp