Hi, I would like to discuss the plan I have in mind.
First, I joined Hama in order to contribute a graph package. I plan to develop a graph package (called Angrapa) for large-scale graph data running on a shared-nothing architecture. Many people think that the BSP model, as used in Pregel, is a good fit for Angrapa, and I agree. In addition, Pregel seems to be validated by ACM SIGMOD Conf. 2010.

The BSP package is the base of Angrapa, so we should complete the BSP package first. Besides, we should address two design considerations that arise from large-scale data and shared-nothing architectures, recently called cloud computing environments: heterogeneity and fault tolerance. In addition, it is important to design BSP to be general purpose, since it will also be used for the matrix package and other problems. Once the Pregel paper is published, we should adopt the techniques discussed in it.

In sum,

* We should complete the BSP package and make it as general purpose as possible.
* The BSP package has to take both heterogeneity and fault tolerance into account.

I would appreciate any advice.

Best wishes,

--
Hyunsik Choi
Database & Information Systems Group, Korea Univ.
http://diveintodata.org
