+1

On Wed, Jul 4, 2012 at 12:46 PM, Praveen Sripati <[email protected]> wrote:
> The o.a.hama.graph.Aggregator interface has the following method:
>
> public void aggregate(VERTEX vertex, M value);
>
> A couple of things:
>
> 1. Why send the value when it can be obtained from the vertex?
>
> 2. Why send the complete vertex? In the case of semi-clustering as described in
> the Google Pregel paper, each vertex maintains a list of semi-clusters and
> the data associated with it. Since all the vertices are sent to the master
> in each superstep, this might be a bottleneck with huge graphs.
>
> 3. The o.a.giraph.graph.Aggregator class has a better interface, where only the
> values to be aggregated are sent over the wire.
>
> 4. Also, will there be a requirement to do the aggregation in only some
> supersteps and not all? Let's say, to calculate the number of vectors/edges
> in the input and the output graph. In this scenario, aggregation in the
> first and last superstep should be enough.
>
> Any thoughts? Should I open a JIRA for this?
>
> Thanks,
> Praveen
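
To make points 3 and 4 concrete, here is a rough sketch of what a value-only aggregator could look like. This is only an illustration: ValueAggregator, SumAggregator, AggregatorSketch and shouldAggregate are hypothetical names, not the Hama or Giraph API, and running it in only the first and last superstep is just the scenario from point 4.

```java
// Hypothetical interface (an assumption, not the Hama or Giraph API):
// only the value to be aggregated is passed in, never the whole vertex.
interface ValueAggregator<M> {
    void aggregate(M value);
    M getValue();
}

// Sums long values, e.g. per-vertex edge counts.
class SumAggregator implements ValueAggregator<Long> {
    private long sum = 0L;

    @Override
    public void aggregate(Long value) {
        sum += value;
    }

    @Override
    public Long getValue() {
        return sum;
    }
}

class AggregatorSketch {
    // Point 4: aggregate only in selected supersteps (here the first and the last).
    static boolean shouldAggregate(long superstep, long lastSuperstep) {
        return superstep == 0 || superstep == lastSuperstep;
    }

    public static void main(String[] args) {
        long[] edgeCounts = {2L, 3L, 1L}; // made-up per-vertex edge counts
        long lastSuperstep = 2;

        for (long superstep = 0; superstep <= lastSuperstep; superstep++) {
            if (!shouldAggregate(superstep, lastSuperstep)) {
                continue; // nothing is sent to the master in this superstep
            }
            SumAggregator aggregator = new SumAggregator();
            for (long count : edgeCounts) {
                aggregator.aggregate(count); // only the value travels, not the vertex
            }
            System.out.println("superstep " + superstep + ": edges = " + aggregator.getValue());
        }
    }
}
```

With an interface like this, a vertex holding a large list of semi-clusters would only contribute the single value it wants aggregated.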
--
Best Regards, Edward J. Yoon
@eddieyoon
