Hi,

Each BSP task (GraphJobRunner) processes one input split, which is equal to or (if newly partitioned) smaller than the DFS block size. So we can assume that each split fits in memory. See HAMA-783.
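To illustrate the point, here is a minimal sketch of one task loading the vertices of a single split into ArrayList-backed adjacency lists. This is not Hama's actual code: SimpleVertex, loadSplit, and the tab/comma-separated line format are hypothetical names and assumptions made only for this example. The idea is that the edge lists built by one task are limited by what its split contains, although the in-memory object overhead can be larger than the on-disk size.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Hypothetical stand-in for a vertex; NOT Hama's Vertex class.
    static final class SimpleVertex {
        final String id;
        // Edges are kept in memory, as in the ArrayList-based approach discussed here.
        final List<String> edges = new ArrayList<>();

        SimpleVertex(String id) {
            this.id = id;
        }
    }

    // Parses one split, given as lines of "vertexId<TAB>neighbor1,neighbor2,...".
    // The line format is an assumption made for this sketch.
    static List<SimpleVertex> loadSplit(List<String> lines) {
        List<SimpleVertex> vertices = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.split("\t", 2);
            SimpleVertex v = new SimpleVertex(parts[0]);
            // A vertex with no neighbors has an empty (or missing) second field.
            if (parts.length == 2 && !parts[1].isEmpty()) {
                for (String target : parts[1].split(",")) {
                    v.edges.add(target);
                }
            }
            vertices.add(v);
        }
        return vertices;
    }

    public static void main(String[] args) {
        // A tiny split: the task only ever holds the vertices of its own split.
        List<String> split = List.of("a\tb,c", "b\tc", "c\t");
        for (SimpleVertex v : loadSplit(split)) {
            System.out.println(v.id + " -> " + v.edges);
        }
    }
}
```

Because each task only parses its own split, the number of edges it holds is limited by the split size rather than by the size of the whole graph.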
The memory issue is related to the processing of vertex messages and the in-memory queue. I expect that it will be solved in the next release.

--
Best Regards, Edward J. Yoon
@eddieyoon

On Oct 19, 2013, at 8:15 AM, Yexi Jiang <[email protected]> wrote:

> Hi,
>
> I recently read the source code for the graph processing. I found that
> the edge information of a vertex is stored in an ArrayList, which implicitly
> assumes that the edges of a vertex fit in memory. Will such an
> implementation cause problems for some extremely large graphs?
>
> I also read the source code of Apache Giraph; it seems that they have a
> similar implementation (the edges are stored in an in-memory data structure).
>
> Regards,
> Yexi
