Github user pfontana3w2 commented on the pull request:

    https://github.com/apache/spark/pull/2100#issuecomment-53118652
  
    Please take a look. I noticed that with this patch, dangling nodes no longer
    have page ranks equal to their reset probabilities, so the original behavior
    may not have been an error, or my solution may not be the right one. Here is
    a small test that I ran.
    
    Edge Table:
    
    | Node1 | Node2 |
    | ----- | ----- |
    | 1 | 2 |
    | 1 | 3 |
    | 3 | 2 |
    | 3 | 4 |
    | 5 | 3 |
    | 6 | 7 |
    | 7 | 8 |
    | 8 | 9 |
    | 9 | 7 |
    
    Node Table:
    
    | NodeName | NodeID |
    | -------- | ------ |
    | a | 1 |
    | b | 2 |
    | c | 3 |
    | d | 4 |
    | e | 5 |
    | f | 6 |
    | g | 7 |
    | h | 8 |
    | i | 9 |
    | j.longaddress.com | 10 |
    
    Page Ranks Before Patch:
    
    (4,0.29503124999999997)
    (1,0.15)
    (6,0.15)
    (3,0.34124999999999994)
    (7,1.3299054047985106)
    (9,1.2381240056453071)
    (8,1.2803346052504254)
    (10,0.15)
    (5,0.15)
    (2,0.35878124999999994)
    
    
    Page Ranks After Patch:
    
    (4,0.2488125)
    (1,0.3)
    (6,0.3)
    (3,0.5325)
    (7,0.23925000000000002)
    (9,0.19335000000000002)
    (8,0.45600000000000007)
    (10,0.3)
    (5,0.3)
    (2,0.2488125)
    
    Note that node 10 does not appear in the edge table.
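
As a sanity check on the "Before Patch" numbers, here is a minimal sketch in plain Python (not GraphX code) that iterates the unnormalized recurrence rank(v) = resetProb + (1 - resetProb) * sum(rank(u) / outDegree(u)) over the edge table above. The function name `unnormalized_pagerank`, the reset probability of 0.15, and the iteration count of 50 are assumptions made for illustration; they are not taken from the patch.

```python
# Minimal sketch, not the GraphX implementation.
# Assumptions: reset_prob = 0.15 and num_iter = 50 are illustrative values.

EDGES = [(1, 2), (1, 3), (3, 2), (3, 4), (5, 3),
         (6, 7), (7, 8), (8, 9), (9, 7)]
NODES = list(range(1, 11))  # node 10 has no edges at all


def unnormalized_pagerank(nodes, edges, reset_prob=0.15, num_iter=50):
    """Iterate rank(v) = reset_prob + (1 - reset_prob) * sum of rank(u) / out_degree(u)."""
    out_degree = {v: 0 for v in nodes}
    for src, _ in edges:
        out_degree[src] += 1

    # Every node starts at the reset probability.
    ranks = {v: reset_prob for v in nodes}
    for _ in range(num_iter):
        # Sum the contributions arriving along each incoming edge.
        incoming = {v: 0.0 for v in nodes}
        for src, dst in edges:
            incoming[dst] += ranks[src] / out_degree[src]
        ranks = {v: reset_prob + (1.0 - reset_prob) * incoming[v] for v in nodes}
    return ranks


if __name__ == "__main__":
    for node, rank in sorted(unnormalized_pagerank(NODES, EDGES).items()):
        print((node, rank))
```

Under these assumptions, nodes with no incoming edges (1, 5, 6, and 10) stay at 0.15, and nodes 2, 3, and 4 reproduce the "Before Patch" values up to floating-point rounding. The values for the 7, 8, 9 cycle depend on the number of iterations, so they only land close to the numbers listed above.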


