Sure Hassan, thanks for the reply!

On Fri, Mar 4, 2016 at 2:46 PM, Hassan Eslami <[email protected]> wrote:
> Anirudh,
>
> 1) AFAIK, Giraph does not implement a load-balancing mechanism, although
> the mechanism for partition migration is implemented. You may want to use
> that mechanism to implement your own load balancer inside the framework.
> Take a look at BspServiceWorker#exchangeVertexPartitions for this purpose.
>
> 2) i. Look at PartitionUtils#computePartitionCount. Generally, if you have
> n machines, the number of partitions is n*n (each worker gets n
> partitions). You can set the total number of partitions with the flag
> -Dgiraph.userPartitionCount (for instance, -Dgiraph.userPartitionCount=100
> gives 100 partitions in total).
> ii. The number of partitions generally remains constant throughout the
> computation. It is computed once at the beginning and stays the same for
> the rest of the computation.
> iii. There are statistics (such as how many vertices each partition has
> and how long it took to process each partition; see, for instance, the
> PartitionStats class), which are mostly used for logging.
>
> Best,
> Hassan
>
> On Thu, Mar 3, 2016 at 1:46 PM, Anirudh Perugu <[email protected]> wrote:
>
>> Hi,
>>
>> I am a Giraph newbie and have read how Giraph works, but I have a couple
>> of questions.
>>
>> 1. If a machine has too much work to do, is it possible to migrate work
>> to another machine for faster computation? (Or is this handled by
>> partitions from the master?)
>>
>> (Please view the diagram below.)
>> 2. i. How is the number of partitions decided?
>> ii. What kind of statistics are stored, and how do they help the master
>> choose the number of partitions for the next superstep?
>> iii. These statistics are in memory (because they cannot be written to
>> the disk), am I correct?
>> [image: Inline image 2]
>>
>> Thanks,
>> Anirudh
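
For anyone following the thread, here is a minimal Java sketch of the rule of thumb from 2) i.: with n workers the default is n*n partitions, unless a total is set through -Dgiraph.userPartitionCount. This is not Giraph's source. The class PartitionCountSketch, the method partitionCount, and the convention that a non-positive value means "flag not set" are all assumptions made for illustration; the real logic lives in PartitionUtils#computePartitionCount and may apply further adjustments.

```java
// Illustrative sketch only: mirrors the rule of thumb described in the thread,
// not the actual body of Giraph's PartitionUtils#computePartitionCount.
final class PartitionCountSketch {

    /**
     * Hypothetical helper.
     *
     * @param workers            number of worker machines (n); must be positive
     * @param userPartitionCount value given via -Dgiraph.userPartitionCount;
     *                           a value <= 0 is treated here as "flag not set"
     *                           (an assumption of this sketch)
     * @return total number of partitions for the whole computation
     */
    static int partitionCount(int workers, int userPartitionCount) {
        if (workers <= 0) {
            throw new IllegalArgumentException("workers must be positive");
        }
        if (userPartitionCount > 0) {
            // An explicit user setting overrides the default.
            return userPartitionCount;
        }
        // Default from the thread: n workers -> n * n partitions,
        // so each worker ends up with n partitions.
        return Math.multiplyExact(workers, workers);
    }

    public static void main(String[] args) {
        System.out.println(partitionCount(4, 0));    // default rule with 4 workers
        System.out.println(partitionCount(4, 100));  // explicit override
    }
}
```

Since the count is computed once at the start and then kept for the whole run, a function like this would be called a single time before the first superstep.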
