[GitHub] madlib issue #291: Feature: Vector to Columns

2018-07-16 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/291
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/567/



---


[GitHub] madlib issue #291: Feature: Vector to Columns

2018-07-16 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/291
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/566/



---


[GitHub] madlib issue #294: Pagerank: Remove duplicate entries from grouping output

2018-07-16 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/294
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/564/



---


[GitHub] madlib issue #294: Pagerank: Remove duplicate entries from grouping output

2018-07-16 Thread njayaram2
Github user njayaram2 commented on the issue:

https://github.com/apache/madlib/pull/294
  
Thank you for the comments, @jingyimei. I have pushed a commit with a new 
dev-check test.


---


[GitHub] madlib issue #294: Pagerank: Remove duplicate entries from grouping output

2018-07-16 Thread jingyimei
Github user jingyimei commented on the issue:

https://github.com/apache/madlib/pull/294
  
The current master has another issue:
When you run
```
DROP TABLE IF EXISTS vertex, "EDGE";
CREATE TABLE vertex(
id INTEGER
);
CREATE TABLE "EDGE"(
src INTEGER,
dest INTEGER,
user_id INTEGER
);
INSERT INTO vertex VALUES
(0),
(1),
(2);
INSERT INTO "EDGE" VALUES
(0, 1, 1),
(0, 2, 1),
(1, 2, 1),
(2, 1, 1),
(0, 1, 2);


DROP TABLE IF EXISTS pagerank_ppr_grp_out;
DROP TABLE IF EXISTS pagerank_ppr_grp_out_summary;
SELECT pagerank(
'vertex', -- Vertex table
'id', -- Vertex id column
'"EDGE"', -- "EDGE" table
'src=src, dest=dest', -- "EDGE" args
'pagerank_ppr_grp_out', -- Output table of PageRank
NULL, -- Default damping factor (0.85)
NULL, -- Default max iters (100)
NULL, -- Default Threshold 
'user_id');
```

you will get the following result:
```
madlib=# select * from pagerank_ppr_grp_out order by user_id, id;
 user_id | id |     pagerank
---------+----+-------------------
       1 |  0 |              0.05
       1 |  0 |              0.05
       1 |  1 | 0.614906399170753
       1 |  2 | 0.614906399170753
       2 |  0 |             0.075
       2 |  1 |           0.13875
(6 rows)
```

For user_id=1 the pagerank scores do not sum to 1, as they should. This PR 
fixes the issue and gives the correct numbers. However, dev-check did not 
previously have a case that catches it. I suggest adding this corner case to 
dev-check to guard future changes.
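
As a rough illustration of what such a corner-case check could look at, here is a minimal sketch. It is written in Python rather than SQL so that it runs without a database, and it is not the test that was added to dev-check. The rows are copied from the query output above; the helper names `duplicate_keys` and `group_sums` are hypothetical and not part of MADlib. The sketch only reports duplicated (user_id, id) pairs and per-group sums; the sum-to-1 expectation is the one stated above for user_id=1.

```python
from collections import Counter, defaultdict

# Rows copied from the query output quoted above: (user_id, id, pagerank).
rows = [
    (1, 0, 0.05),
    (1, 0, 0.05),
    (1, 1, 0.614906399170753),
    (1, 2, 0.614906399170753),
    (2, 0, 0.075),
    (2, 1, 0.13875),
]


def duplicate_keys(rows):
    """Return the (user_id, id) pairs that occur more than once, sorted."""
    counts = Counter((user_id, vertex_id) for user_id, vertex_id, _ in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def group_sums(rows):
    """Return {user_id: sum of pagerank over that group's rows}."""
    sums = defaultdict(float)
    for user_id, _, score in rows:
        sums[user_id] += score
    return dict(sums)


dups = duplicate_keys(rows)
sums = group_sums(rows)

print(dups)      # [(1, 0)]: vertex 0 is listed twice for user_id=1
print(sums[1])   # well above 1 for the output quoted above
```

A dev-check case built on the same idea would presumably fail if `dups` is non-empty or if the sum for user_id=1 is not close to 1.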


---


[GitHub] madlib pull request #290: madpack: Add madpack option to run unit tests.

2018-07-16 Thread njayaram2
Github user njayaram2 closed the pull request at:

https://github.com/apache/madlib/pull/290


---


[GitHub] madlib issue #289: RF: Add impurity variable importance

2018-07-16 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/289
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/563/



---


[GitHub] madlib issue #294: Pagerank: Remove duplicate entries from grouping output

2018-07-16 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/294
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/562/



---


[GitHub] madlib pull request #294: Pagerank: Remove duplicate entries from grouping o...

2018-07-16 Thread njayaram2
GitHub user njayaram2 opened a pull request:

https://github.com/apache/madlib/pull/294

Pagerank: Remove duplicate entries from grouping output

JIRA: MADLIB-1229
JIRA: MADLIB-1253

Fixes the missing output for complete graphs bug as well.

Co-authored-by: Nandish Jayaram 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/madlib/madlib bug/pagerank-dup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/madlib/pull/294.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #294


commit 1b55acac9d5550e0a74fa46ec0ab4842d089ac1c
Author: Orhan Kislal 
Date:   2018-07-14T00:09:11Z

Pagerank: Remove duplicate entries from grouping output

JIRA: MADLIB-1229
JIRA: MADLIB-1253

Fixes the missing output for complete graphs bug as well.

Co-authored-by: Nandish Jayaram 




---


[GitHub] madlib pull request #288: Jira:1239: Converts features from multiple columns...

2018-07-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/madlib/pull/288


---