Github user jingyimei commented on the issue:
https://github.com/apache/madlib/pull/294
The current master has another issue:
When you run
```
DROP TABLE IF EXISTS vertex, "EDGE";
CREATE TABLE vertex(
id INTEGER
);
CREATE TABLE "EDGE"(
src INTEGER,
dest INTEGER,
user_id INTEGER
);
INSERT INTO vertex VALUES
(0),
(1),
(2);
INSERT INTO "EDGE" VALUES
(0, 1, 1),
(0, 2, 1),
(1, 2, 1),
(2, 1, 1),
(0, 1, 2);
DROP TABLE IF EXISTS pagerank_ppr_grp_out;
DROP TABLE IF EXISTS pagerank_ppr_grp_out_summary;
SELECT pagerank(
'vertex', -- Vertex table
'id', -- Vertex id column
'"EDGE"', -- "EDGE" table
'src=src, dest=dest', -- "EDGE" args
'pagerank_ppr_grp_out', -- Output table of PageRank
NULL, -- Default damping factor (0.85)
NULL, -- Default max iters (100)
NULL, -- Default Threshold
'user_id');
```
you will get the following result:
```
madlib=# select * from pagerank_ppr_grp_out order by user_id, id;
 user_id | id |     pagerank
---------+----+-------------------
       1 |  0 |              0.05
       1 |  0 |              0.05
       1 |  1 | 0.614906399170753
       1 |  2 | 0.614906399170753
       2 |  0 |             0.075
       2 |  1 |           0.13875
(6 rows)
```
Here the PageRank scores for user_id=1 do not sum to 1, as they should
(they add up to about 1.33). This PR fixes the issue and gives the correct
numbers. However, dev check had no case that would catch this problem. I
suggest adding this corner case to dev check to test future changes.
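A dev-check assertion for this case could look roughly like the query below. It is only a sketch: it assumes the output table and column names from the example above, the 1e-6 tolerance is an arbitrary choice, and it checks only user_id=1, the group this comment is about.

```sql
-- Sketch of a possible dev-check query (not taken from the MADlib test suite).
-- Returns a row only if the PageRank scores for user_id = 1 do not sum to 1,
-- so an empty result means the check passes.
SELECT user_id,
       SUM(pagerank) AS total_pagerank
FROM pagerank_ppr_grp_out
WHERE user_id = 1
GROUP BY user_id
HAVING ABS(SUM(pagerank) - 1.0) > 1e-6;  -- tolerance is an assumed value
```

On current master this query should return the user_id=1 row, since the scores shown above sum to more than 1.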
---