[ 
https://issues.apache.org/jira/browse/MADLIB-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429013#comment-16429013
 ] 

Himanshu Pandey commented on MADLIB-1084:
-----------------------------------------

Frank McQuillan, Jingyi Mei,

I tested pagerank with this GUC set to OFF:

{code}

set optimizer_enable_tablescan = off;

{code}

 
When the above GUC is ON, ORCA does a Table scan; when it's OFF, it goes with 
a Seq. Scan. 
With the GUC turned OFF, the install check time (and also the query runtime) is 
reduced to half: 

*GPDB 5.6.1 (CentOS Linux release 7.4.1708 (Core) )* 

{code}

TEST CASE RESULT|Module: glm|gamma.sql_in|PASS|Time: 13015 milliseconds

TEST CASE RESULT|Module: glm|binomial.sql_in|PASS|Time: 4352 milliseconds

TEST CASE RESULT|Module: graph|wcc.sql_in|PASS|Time: 2425 milliseconds

TEST CASE RESULT|Module: graph|sssp.sql_in|PASS|Time: 4210 milliseconds

TEST CASE RESULT|Module: graph|pagerank.sql_in|PASS|Time: 76533 milliseconds

TEST CASE RESULT|Module: graph|measures.sql_in|PASS|Time: 2181 milliseconds

TEST CASE RESULT|Module: graph|hits.sql_in|PASS|Time: 2986 milliseconds

TEST CASE RESULT|Module: graph|bfs.sql_in|PASS|Time: 5325 milliseconds

TEST CASE RESULT|Module: graph|apsp.sql_in|PASS|Time: 1683 milliseconds

TEST CASE RESULT|Module: linear_systems|sparse_linear_sytems.sql_in|PASS|Time: 908 milliseconds

{code}

 

Query Results

 

With grouping:

{code}

gpadmin=# SELECT madlib.pagerank(  'vertex', 'id',  'edge',  'src=src, dest=dest', 'pagerank_out',  NULL,  NULL,  NULL, 'user_id', '{1,3}');

pagerank

----------
(1 row)

Time: 49630.882 ms
{code}

Without grouping:

{code}

gpadmin=# SELECT madlib.pagerank(  'vertex', 'id',  'edge',  'src=src, dest=dest', 'pagerank_out',  NULL,  NULL,  NULL, NULL, '{1,3}');

pagerank
----------

(1 row)

Time: 7767.580 ms
{code}



I am not sure whether we can set GUCs internally in MADlib, so this is just an FYI. 
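
Short of changing MADlib itself, one possible way for a caller to limit the setting to a single run is the standard PostgreSQL SET LOCAL form, which reverts at the end of the transaction. This is an untested sketch: it assumes optimizer_enable_tablescan can be changed with SET LOCAL in the same way as the session-level SET shown above, and that the output table 'pagerank_out' does not already exist.

{code}
BEGIN;

-- Takes effect only until COMMIT or ROLLBACK; the session default is restored afterwards.
SET LOCAL optimizer_enable_tablescan = off;

-- Same grouped call as in the timing test above.
SELECT madlib.pagerank('vertex', 'id', 'edge', 'src=src, dest=dest', 'pagerank_out', NULL, NULL, NULL, 'user_id', '{1,3}');

COMMIT;
{code}

This leaves other queries in the same session unaffected. It does not answer whether the module can set the GUC on its own.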


> Graph - Personalized PageRank
> -----------------------------
>
>                 Key: MADLIB-1084
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1084
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Graph
>            Reporter: Frank McQuillan
>            Assignee: Himanshu Pandey
>            Priority: Major
>             Fix For: v1.14
>
>         Attachments: GraphTest.py
>
>
> Personalized PageRank which is a variant of regular PageRank.
> Please refer to  
> [http://madlib.apache.org/docs/latest/group__grp__pagerank.html] as a 
> starting point.
> Reference:
>  Neighborhood Formation and Anomaly Detection in Bipartite Graphs
>  [http://www.cs.cmu.edu/~deepay/mywww/papers/icdm05.pdf]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
