Praveen, response inline.  Hope it's helpful.

On 6/30/12 10:47 AM, Praveen Sripati wrote:
Could someone respond to the below mail please?

Thanks,
Praveen

On Thu, Jun 28, 2012 at 7:04 PM, Praveen Sripati
<[email protected]> wrote:

Around the 24th minute of the recent Hadoop Summit video [1], Avery Ching
talks about how Giraph is made scalable. I am interested in Hama, which is
also based on the BSP model, and would like to know more details on how
Giraph is made scalable.

Basically, at the end of each superstep, the BSP tasks send some metrics
to the master, and the master repartitions the data held by the most loaded
BSP tasks and uses the free map slots to process it.

1) Where is the code for the above logic? I am new to Giraph.

See BspWorker#finishSuperstep()

2) What is the logic behind the partitioning of the data in the master
after the superstep? Let's say that the data has been partitioned using
hash partitioning.

See GraphPartitionerFactory

3) Similarly, will Giraph also scale down? Will the partitions be merged?

This is totally up to the implementation of GraphPartitionerFactory.
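
To make that a bit more concrete, here is a rough sketch of what a
load-aware partitioning policy could look like. It is not Giraph code and
does not use the Giraph API: RebalanceSketch, partitionOf, workerLoads and
rebalanceOnce are hypothetical names, and "load" is assumed to be the vertex
count per partition as reported at the end of a superstep. A real
GraphPartitionerFactory implementation may use different stats and a
different policy.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Giraph.
class RebalanceSketch {

    // Hash partitioning: map a vertex id to one of numPartitions partitions.
    static int partitionOf(long vertexId, int numPartitions) {
        return Math.floorMod(Long.hashCode(vertexId), numPartitions);
    }

    // Sum the per-partition sizes into a per-worker load.
    // owner[p] is the worker that currently holds partition p.
    static long[] workerLoads(int[] owner, long[] partitionSize, int numWorkers) {
        long[] load = new long[numWorkers];
        for (int p = 0; p < owner.length; p++) {
            load[owner[p]] += partitionSize[p];
        }
        return load;
    }

    // One master-side step: move a single partition from the most loaded
    // worker to the least loaded one, if that narrows the gap between them.
    // Returns true if a partition was reassigned.
    static boolean rebalanceOnce(int[] owner, long[] partitionSize, int numWorkers) {
        long[] load = workerLoads(owner, partitionSize, numWorkers);
        int max = 0;
        int min = 0;
        for (int w = 1; w < numWorkers; w++) {
            if (load[w] > load[max]) max = w;
            if (load[w] < load[min]) min = w;
        }
        long gap = load[max] - load[min];
        int best = -1;
        for (int p = 0; p < owner.length; p++) {
            long size = partitionSize[p];
            // After moving p the gap between the two workers is |gap - 2*size|,
            // which is a strict improvement only when 0 < size < gap.
            if (owner[p] != max || size <= 0 || size >= gap) continue;
            if (best < 0
                    || Math.abs(gap - 2 * size) < Math.abs(gap - 2 * partitionSize[best])) {
                best = p;
            }
        }
        if (best < 0) return false;
        owner[best] = min;
        return true;
    }

    public static void main(String[] args) {
        int[] owner = {0, 0, 0, 1};          // partitions 0..2 on worker 0
        long[] sizes = {50, 30, 20, 10};     // vertices per partition
        System.out.println("before: " + Arrays.toString(workerLoads(owner, sizes, 2)));
        while (rebalanceOnce(owner, sizes, 2)) {
            // keep moving partitions until no move helps
        }
        System.out.println("after:  " + Arrays.toString(workerLoads(owner, sizes, 2)));
    }
}
```

In this toy setup only the partition-to-worker assignment changes; splitting
or merging partitions (the "scale down" case) would be a further policy
decision left to the partitioner implementation.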

Thanks,
Praveen

[1] - http://www.youtube.com/watch?v=b5Qmz4zPj-M


