Thank you very much for these, they provide great insight into what went into 
the decision making when they were developed.

When I have the time, I'll look into which of the approaches seem to be more 
favored in the literature (in terms of the vector to use when teleporting from 
a dangling node).

Thanks a lot for the help,
Best,
Omer


________________________________
From: Tamas Nepusz <nta...@gmail.com>
Sent: Tuesday, January 28, 2020 9:34 AM
To: Yalcin, Omer Faruk <o...@psu.edu>
Cc: Help for igraph users <igraph-help@nongnu.org>
Subject: Re: [igraph] personalized pagerank computation issue

Hi Omer,

Your comment rang a bell -- I remembered that this issue has already popped up 
back in 2014 when we first switched from ARPACK to PRPACK; see the following 
issue in the issue tracker:

https://github.com/igraph/igraph/issues/671<https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Figraph%2Figraph%2Fissues%2F671&data=02%7C01%7Coxy4%40psu.edu%7Cfebfb4e0d71b41507bab08d7a3ff442d%7C7cf48d453ddb4389a9c1c115526eb52e%7C0%7C0%7C637158189070639430&sdata=plMoFjjhOflFb9r3YhcmXSjmUQXcn5LJLk2yzfqGWt0%3D&reserved=0>

Memories of mine about PRPACK were much fresher by then as I wrote the 
following:

  *   with the ARPACK implementation, the random walker teleports according to 
the distribution of the reset vector when it encounters a dangling node
  *   with the PRPACK implementation, the random walker teleports according to 
a uniform distribution by default.

After reading the thread, it also became apparent that PRPACK is doing some 
trickery with sink nodes; the following is from @dgleich, one of the original 
authors of PRPACK:

"For the case that you are jumping to nodes that have no outgoing edges, what 
happens is you add new edges according to the teleportation set/reset set or 
the special set "u"."

So, all in all, the case is probably that your implementation matches the 
ARPACK implementation of PRPACK from igraph, and that's why you are seeing 
identical results (both your code and igraph's ARPACK implementation formulates 
PageRank as an eigenvector problem and solves that). In case of PRPACK, 
PageRank is not an eigenvector problem any more; again, quoting @dgleich:

"PRPACK does decompose the graph into SCCs, but the primary advantage is that 
it frames the PageRank problem as a linear system instead of the eigenvalue 
problem. This has tremendous numerical advantages."

There was also a PR quite a while ago that attempted to introduce the 
possibility of specifying the personalization vector and the reset vector 
separately, but it did not get merged in the end:

https://github.com/igraph/igraph/pull/673<https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Figraph%2Figraph%2Fpull%2F673&data=02%7C01%7Coxy4%40psu.edu%7Cfebfb4e0d71b41507bab08d7a3ff442d%7C7cf48d453ddb4389a9c1c115526eb52e%7C0%7C0%7C637158189070649379&sdata=YhfCOmgyfiOdyb3QuD7W20SMZK%2B3871omzzbde3I%2Foo%3D&reserved=0>

Another issue where this question popped up -- maybe it also provides more 
insight:

https://github.com/igraph/igraph/issues/1211<https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Figraph%2Figraph%2Fissues%2F1211&data=02%7C01%7Coxy4%40psu.edu%7Cfebfb4e0d71b41507bab08d7a3ff442d%7C7cf48d453ddb4389a9c1c115526eb52e%7C0%7C0%7C637158189070649379&sdata=moxDpNr%2B8OHyoK3U0%2BMaP9hMJtcEp%2BrtZUwz%2FbtVPTU%3D&reserved=0>

Best,
T.


On Mon, 27 Jan 2020 at 23:17, Yalcin, Omer Faruk 
<o...@psu.edu<mailto:o...@psu.edu>> wrote:
Thank you very much for tracking the code! Unfortunately, that doesn't work 
either. I am also fairly certain that allowing the node to stay at the same 
spot would give that node an unwarranted boost in pagerank, so it is probably 
undesirable.

I do have an interesting result though; when I use "arpack" instead of the 
default "prpack", I get the exact same results as my custom written function. 
In other words, when there are nodes that have no outgoing edges, "prpack" and 
"arpack" do the computation differently.

My problem seems to be solved (as long as there is no reason why "arpack" is 
wrong) but this difference between the two algorithms might be of interest to 
you.

Thank you very much.
Omer
________________________________
From: Tamas Nepusz <nta...@gmail.com<mailto:nta...@gmail.com>>
Sent: Monday, January 27, 2020 4:15 PM
To: Yalcin, Omer Faruk <o...@psu.edu<mailto:o...@psu.edu>>
Cc: Help for igraph users 
<igraph-help@nongnu.org<mailto:igraph-help@nongnu.org>>
Subject: Re: [igraph] personalized pagerank computation issue


That being said, after your question, I set the probability of navigating to 
other nodes from a node that has no outbound links to the personalization 
vector. That doesn't reproduce the igraph result either.
There's also a third option: if there are no outbound nodes, stay at the same 
node with probability equal to 1-damping, _or_ navigate to a randomly picked 
node accoding to the persionalization vector with probability equal to damping. 
Sorry for not being too precise here; the thing is that igraph is using an 
external library (PRPACK) to calculate personalized PageRank scores, and I only 
managed to track the code to a point where I am convinced that we are passing 
down two  vectors to PRPACK; one is a uniform vector, and the other one is the 
personalization vector submitted by the user. Based on this, I would assume 
that PRPACK uses the personalization vector when the random walk is reset, and 
the uniform vector for a random teleport (after all, why would PRPACK need two 
vectors if it used the personalization vector for both cases?), but I did not 
manage to track it down further because PRPACK contains at least six different 
solvers, optimized for different use-cases, and I did not manage to figure out 
which one it would use in your particular case. But I'm pretty sure that the 
discrepancy between your results and ours is due to some corner case in the 
handling of sink nodes.

T.
_______________________________________________
igraph-help mailing list
igraph-help@nongnu.org
https://lists.nongnu.org/mailman/listinfo/igraph-help

Reply via email to