Re: FYI, Comparison and Evaluation of Open Source Implementations of Pregel and Related Systems
Thanks Song Bai and Ed for your replies. Looking forward to Song's contributions and to HAMA-843/816 being done.

Tommaso

P.S.: I think we need a way of continuously benchmarking our trunk (e.g. set up 2+ machines in distributed mode and run tests/benchmarks against them via Jenkins, but I don't know if that's really feasible via ASF Jenkins).

2014/1/13 Edward J. Yoon edwardy...@apache.org
Once HAMA-843 is committed, PageRank performance will be dramatically improved. The scalability issue is related to In-Memory VerticesInfo and Queue. DiskVerticesInfo is now available. Disk/Spilling Queue issues will be fixed soon. Also, the Graph package's performance can be improved once more with HAMA-816.

On Mon, Jan 13, 2014 at 1:14 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote:
By the way, is anyone aware of what kind of failures were behind the PageRank failures highlighted in the mentioned slides (or does anyone know whom we can ask)?
Tommaso

2014/1/10 Edward J. Yoon edwardy...@apache.org
Just FYI, https://cs.uwaterloo.ca/~kdaudjee/courses/cs848/slides/proj/F13/JPV.pdf
--
Best Regards, Edward J. Yoon
@eddieyoon
--
Best Regards, Edward J. Yoon
@eddieyoon
Re: FYI, Comparison and Evaluation of Open Source Implementations of Pregel and Related Systems
I'm not entirely sure, but it seems JUnitBenchmarks can be integrated with Jenkins.

On 13 January 2014 17:05, Tommaso Teofili tommaso.teof...@gmail.com wrote:
Thanks Song Bai and Ed for your replies. Looking forward to Song's contributions and to HAMA-843/816 being done.
Tommaso
P.S.: I think we need a way of continuously benchmarking our trunk (e.g. set up 2+ machines in distributed mode and run tests/benchmarks against them via Jenkins, but I don't know if that's really feasible via ASF Jenkins).

2014/1/13 Edward J. Yoon edwardy...@apache.org
Once HAMA-843 is committed, PageRank performance will be dramatically improved. The scalability issue is related to In-Memory VerticesInfo and Queue. DiskVerticesInfo is now available. Disk/Spilling Queue issues will be fixed soon. Also, the Graph package's performance can be improved once more with HAMA-816.

On Mon, Jan 13, 2014 at 1:14 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote:
By the way, is anyone aware of what kind of failures were behind the PageRank failures highlighted in the mentioned slides (or does anyone know whom we can ask)?
Tommaso

2014/1/10 Edward J. Yoon edwardy...@apache.org
Just FYI, https://cs.uwaterloo.ca/~kdaudjee/courses/cs848/slides/proj/F13/JPV.pdf
--
Best Regards, Edward J. Yoon
@eddieyoon
--
Best Regards, Edward J. Yoon
@eddieyoon
Re: FYI, Comparison and Evaluation of Open Source Implementations of Pregel and Related Systems
By the way, is anyone aware of what kind of failures were behind the PageRank failures highlighted in the mentioned slides (or does anyone know whom we can ask)?

Tommaso

2014/1/10 Edward J. Yoon edwardy...@apache.org
Just FYI, https://cs.uwaterloo.ca/~kdaudjee/courses/cs848/slides/proj/F13/JPV.pdf
--
Best Regards, Edward J. Yoon
@eddieyoon
Re: FYI, Comparison and Evaluation of Open Source Implementations of Pregel and Related Systems
I also encountered those failures when running hama-0.6.0. I think there are two problems in Hama.

(1) Hama loads the data into memory for processing, so large data sets may cause the JVM to run out of memory. The solution is to set bsp.child.java.opts in hama-site.xml as large as your machine allows, for example:

  <property>
    <name>bsp.child.java.opts</name>
    <value>-Xmx4096m</value>
  </property>

(2) For PageRank, one BSPPeer may send larger messages than SSSP after finishing a superstep (as you can see, on the same data set, largeEWD, SSSP succeeds but PageRank fails), and the user-defined combiner is not used to reduce the number of messages when they are sent, so the RPC may fail because of the large volume of messages. The solution is to modify the org.apache.hama.graph.GraphJobRunner and org.apache.hama.graph.Vertex classes to apply a combiner when sending messages.

With these two changes, I have solved the large-data problem. Good luck!

On Mon, Jan 13, 2014 at 12:14 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote:
By the way, is anyone aware of what kind of failures were behind the PageRank failures highlighted in the mentioned slides (or does anyone know whom we can ask)?
Tommaso

2014/1/10 Edward J. Yoon edwardy...@apache.org
Just FYI, https://cs.uwaterloo.ca/~kdaudjee/courses/cs848/slides/proj/F13/JPV.pdf
--
Best Regards, Edward J. Yoon
@eddieyoon
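Point (2) of the reply above proposes applying a combiner before messages are sent, so that fewer messages cross the RPC layer. The following is a minimal, self-contained sketch of that idea for PageRank-style contributions, where messages bound for the same vertex can be summed. It does not use Hama's classes: the names SumCombinerSketch, Message and combine are hypothetical and chosen only for illustration, and how the real change would be wired into GraphJobRunner and Vertex is not shown in the thread.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not Hama's API): sum contributions per destination vertex before sending. */
final class SumCombinerSketch {

    /** One outgoing message: a destination vertex ID and a numeric contribution. */
    static final class Message {
        final String destination;
        final double value;

        Message(String destination, double value) {
            this.destination = destination;
            this.value = value;
        }
    }

    /** Collapses all messages for the same destination into a single summed value. */
    static Map<String, Double> combine(List<Message> outgoing) {
        Map<String, Double> combined = new LinkedHashMap<>();
        for (Message m : outgoing) {
            // merge() inserts the value on first sight and adds to it afterwards.
            combined.merge(m.destination, m.value, Double::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        List<Message> outgoing = Arrays.asList(
            new Message("v1", 0.25),
            new Message("v2", 0.5),
            new Message("v1", 0.25));
        // Three outgoing messages become one entry per destination vertex.
        System.out.println(combine(outgoing));
    }
}
```

Summation is safe here because a PageRank vertex only needs the total of its incoming contributions; an algorithm such as SSSP would use a minimum instead of a sum.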
Re: FYI, Comparison and Evaluation of Open Source Implementations of Pregel and Related Systems
Once HAMA-843 is committed, PageRank performance will be dramatically improved. The scalability issue is related to In-Memory VerticesInfo and Queue. DiskVerticesInfo is now available. Disk/Spilling Queue issues will be fixed soon. Also, the Graph package's performance can be improved once more with HAMA-816.

On Mon, Jan 13, 2014 at 1:14 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote:
By the way, is anyone aware of what kind of failures were behind the PageRank failures highlighted in the mentioned slides (or does anyone know whom we can ask)?
Tommaso

2014/1/10 Edward J. Yoon edwardy...@apache.org
Just FYI, https://cs.uwaterloo.ca/~kdaudjee/courses/cs848/slides/proj/F13/JPV.pdf
--
Best Regards, Edward J. Yoon
@eddieyoon
--
Best Regards, Edward J. Yoon
@eddieyoon
Re: FYI, Comparison and Evaluation of Open Source Implementations of Pregel and Related Systems
Dear Edward J. Yoon,

I have read and modified most of the source code of hama-0.6.0. For example:
1. I added a combiner that is applied when a peer sends messages to other peers.
2. Messages sent from superstep i to superstep (i+1) within the same BspPeer do not use the default RPC but are passed through memory.
3. I have implemented some algorithms, such as WCC, top-k and Incremental PageRank.

Unfortunately, I have found some mistakes in the source code. I think the most important one is this: in the Pregel paper, the job terminates when all vertices are simultaneously inactive and there are no messages in transit, but Hama only considers whether the vertices are active or not.

I want to become a contributor to the Hama project, and I am now studying the Hama documentation. Can you give me some suggestions? Waiting for your reply.

On Fri, Jan 10, 2014 at 1:34 PM, Edward J. Yoon edwardy...@apache.org wrote:
Just FYI, https://cs.uwaterloo.ca/~kdaudjee/courses/cs848/slides/proj/F13/JPV.pdf
--
Best Regards, Edward J. Yoon
@eddieyoon
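The termination issue raised in the message above comes down to a two-part condition. As a minimal sketch, assuming the job can observe a count of active vertices and a count of undelivered messages, the two rules can be compared as follows. The class and method names are hypothetical, and this only models the behaviour as described in the message; it is not taken from Hama's source.

```java
/** Hypothetical sketch of the two halting rules contrasted in the message above. */
final class TerminationCheckSketch {

    /** Pregel-style rule: halt only when no vertex is active AND no message is in transit. */
    static boolean shouldHalt(long activeVertices, long messagesInTransit) {
        return activeVertices == 0 && messagesInTransit == 0;
    }

    /** Vertex-only rule, as the message describes the existing behaviour. */
    static boolean shouldHaltVerticesOnly(long activeVertices) {
        return activeVertices == 0;
    }

    public static void main(String[] args) {
        // Every vertex has voted to halt, but three messages are still undelivered.
        long active = 0;
        long inTransit = 3;
        System.out.println("Pregel rule halts:      " + shouldHalt(active, inTransit));
        System.out.println("Vertex-only rule halts: " + shouldHaltVerticesOnly(active));
    }
}
```

The difference matters because, in the Pregel model, an incoming message reactivates an inactive vertex, so halting while messages are pending can drop work that would still change the result.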
Re: FYI, Comparison and Evaluation of Open Source Implementations of Pregel and Related Systems
Hello,

First of all, please create a Jira ID[1] if you don't already have one. Then you can create a JIRA ticket to start contributing ideas and patches[3]. We look forward to your contributions!

1. https://issues.apache.org/jira/secure/Signup!default.jspa
2. https://issues.apache.org/jira/browse/HAMA
3. http://wiki.apache.org/hama/HowToContribute

On Fri, Jan 10, 2014 at 5:06 PM, song bai baison...@gmail.com wrote:
Dear Edward J. Yoon,
I have read and modified most of the source code of hama-0.6.0. For example:
1. I added a combiner that is applied when a peer sends messages to other peers.
2. Messages sent from superstep i to superstep (i+1) within the same BspPeer do not use the default RPC but are passed through memory.
3. I have implemented some algorithms, such as WCC, top-k and Incremental PageRank.
Unfortunately, I have found some mistakes in the source code. I think the most important one is this: in the Pregel paper, the job terminates when all vertices are simultaneously inactive and there are no messages in transit, but Hama only considers whether the vertices are active or not.
I want to become a contributor to the Hama project, and I am now studying the Hama documentation. Can you give me some suggestions? Waiting for your reply.

On Fri, Jan 10, 2014 at 1:34 PM, Edward J. Yoon edwardy...@apache.org wrote:
Just FYI, https://cs.uwaterloo.ca/~kdaudjee/courses/cs848/slides/proj/F13/JPV.pdf
--
Best Regards, Edward J. Yoon
@eddieyoon
--
Best Regards, Edward J. Yoon
@eddieyoon
Re: FYI, Comparison and Evaluation of Open Source Implementations of Pregel and Related Systems
Cool stuff, looking forward to your contributions!

Tommaso

2014/1/10 Edward J. Yoon edwardy...@apache.org
Hello,
First of all, please create a Jira ID[1] if you don't already have one. Then you can create a JIRA ticket to start contributing ideas and patches[3]. We look forward to your contributions!
1. https://issues.apache.org/jira/secure/Signup!default.jspa
2. https://issues.apache.org/jira/browse/HAMA
3. http://wiki.apache.org/hama/HowToContribute

On Fri, Jan 10, 2014 at 5:06 PM, song bai baison...@gmail.com wrote:
Dear Edward J. Yoon,
I have read and modified most of the source code of hama-0.6.0. For example:
1. I added a combiner that is applied when a peer sends messages to other peers.
2. Messages sent from superstep i to superstep (i+1) within the same BspPeer do not use the default RPC but are passed through memory.
3. I have implemented some algorithms, such as WCC, top-k and Incremental PageRank.
Unfortunately, I have found some mistakes in the source code. I think the most important one is this: in the Pregel paper, the job terminates when all vertices are simultaneously inactive and there are no messages in transit, but Hama only considers whether the vertices are active or not.
I want to become a contributor to the Hama project, and I am now studying the Hama documentation. Can you give me some suggestions? Waiting for your reply.

On Fri, Jan 10, 2014 at 1:34 PM, Edward J. Yoon edwardy...@apache.org wrote:
Just FYI, https://cs.uwaterloo.ca/~kdaudjee/courses/cs848/slides/proj/F13/JPV.pdf
--
Best Regards, Edward J. Yoon
@eddieyoon
--
Best Regards, Edward J. Yoon
@eddieyoon
Re: FYI, Comparison and Evaluation of Open Source Implementations of Pregel and Related Systems
Thanks, I will do my best.

2014/1/10, Tommaso Teofili tommaso.teof...@gmail.com:
Cool stuff, looking forward to your contributions!
Tommaso

2014/1/10 Edward J. Yoon edwardy...@apache.org
Hello,
First of all, please create a Jira ID[1] if you don't already have one. Then you can create a JIRA ticket to start contributing ideas and patches[3]. We look forward to your contributions!
1. https://issues.apache.org/jira/secure/Signup!default.jspa
2. https://issues.apache.org/jira/browse/HAMA
3. http://wiki.apache.org/hama/HowToContribute

On Fri, Jan 10, 2014 at 5:06 PM, song bai baison...@gmail.com wrote:
Dear Edward J. Yoon,
I have read and modified most of the source code of hama-0.6.0. For example:
1. I added a combiner that is applied when a peer sends messages to other peers.
2. Messages sent from superstep i to superstep (i+1) within the same BspPeer do not use the default RPC but are passed through memory.
3. I have implemented some algorithms, such as WCC, top-k and Incremental PageRank.
Unfortunately, I have found some mistakes in the source code. I think the most important one is this: in the Pregel paper, the job terminates when all vertices are simultaneously inactive and there are no messages in transit, but Hama only considers whether the vertices are active or not.
I want to become a contributor to the Hama project, and I am now studying the Hama documentation. Can you give me some suggestions? Waiting for your reply.

On Fri, Jan 10, 2014 at 1:34 PM, Edward J. Yoon edwardy...@apache.org wrote:
Just FYI, https://cs.uwaterloo.ca/~kdaudjee/courses/cs848/slides/proj/F13/JPV.pdf
--
Best Regards, Edward J. Yoon
@eddieyoon
--
Best Regards, Edward J. Yoon
@eddieyoon
FYI, Comparison and Evaluation of Open Source Implementations of Pregel and Related Systems
Just FYI, https://cs.uwaterloo.ca/~kdaudjee/courses/cs848/slides/proj/F13/JPV.pdf -- Best Regards, Edward J. Yoon @eddieyoon