The question is: do you have 100GB of main memory? How big are your messages going to be? How dense is the graph? Although we do have out-of-core facilities, this doesn't look to me like a typical graph algorithm, and in particular not one that would take much advantage of Giraph over MapReduce. The reason is the low number of iterations (two), so, especially if you have memory constraints, it could work out pretty easily with MapReduce. In fact, it looks to me like a single MapReduce job, where the mapper handles the first iteration and the reducer does the second, but I could be missing some details.

As far as load balancing is concerned, I guess it depends on your degree distribution. A "random" distribution of vertices through hash partitioning should back you up, but if you have a bunch of nodes that are much more active than the rest, you could end up with some stragglers.
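Just to illustrate the shape I have in mind, here is a minimal sketch. It assumes the input is a text record per node of the form "nodeId<TAB>imagePath<TAB>edgeId1,edgeId2,..." and that evaluateNode() and evaluateEdge() stand in for your own per-node and per-edge computations; those names, the record format, and the missing job/driver configuration are all assumptions, not anything Giraph- or Hadoop-specific. The mapper does your first "parallel for" over nodes and emits each node's result once per incident edge, keyed by edge id; the reducer then sees the two endpoint results of each edge and does the second "parallel for".

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class EdgeEvaluation {

  public static class NodeMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // Assumed record layout: nodeId \t imagePath \t edgeId1,edgeId2,...
      String[] fields = value.toString().split("\t");
      String nodeId = fields[0];
      String imagePath = fields[1];

      // First iteration: evaluate the node (e.g. read and process its image).
      String nodeResult = evaluateNode(nodeId, imagePath);

      // Ship the node result to every edge the node participates in,
      // keyed by edge id so both endpoints meet in the same reduce call.
      for (String edgeId : fields[2].split(",")) {
        context.write(new Text(edgeId), new Text(nodeResult));
      }
    }
  }

  public static class EdgeReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text edgeId, Iterable<Text> nodeResults, Context context)
        throws IOException, InterruptedException {
      // Second iteration: each edge receives exactly its two endpoint results.
      String first = null;
      String second = null;
      for (Text r : nodeResults) {
        if (first == null) first = r.toString();
        else second = r.toString();
      }
      context.write(edgeId, new Text(evaluateEdge(first, second)));
    }
  }

  // Hypothetical placeholders for the actual per-node / per-edge work.
  private static String evaluateNode(String nodeId, String imagePath) {
    return nodeId + ":" + imagePath; // replace with the real node evaluation
  }

  private static String evaluateEdge(String a, String b) {
    return a + "|" + b; // replace with the real edge evaluation
  }
}

Data transfer between the two phases is then just the node results moved by the shuffle, and Hadoop's usual task scheduling takes care of the irregular per-node and per-edge workloads, modulo the straggler caveat above.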
On Thu, May 2, 2013 at 2:12 AM, Hadoop Explorer <[email protected]> wrote:

> I have an application that evaluates a graph using this algorithm:
>
> - use a parallel for loop to evaluate all nodes in a graph (to evaluate a
> node, an image is read, and then the result of this node is calculated)
>
> - use a second parallel for loop to evaluate all edges in the graph. The
> function would take in the results from both nodes of the edge, and then
> calculate the answer for the edge
>
> The final result will consist of the calculated results of each edge. So
> each node and each edge is essentially a job, and in this case, an edge is
> more like a job than a message.
>
> As you can see, the above algorithm would employ two map functions, but no
> reduce function. The total data size can be very large (say 100GB). Also,
> the workload of each node and each edge is highly irregular, and thus load
> balancing mechanisms are essential.
>
> In this case, will Giraph suit this application? If so, what will my
> program look like? And will Giraph be able to strike a balance between good
> load balancing of the second map function and minimizing data transfer of
> the results from the first map function?

--
Claudio Martella
[email protected]
