Hey, the file is split the same way Hadoop does it, as defined by the InputFormat. The vertices are then partitioned at runtime. Raw BSP jobs have the option to partition before the job starts, but that does not scale well, so we have not done it for graph algorithms. There is no load balancing beyond the usual hash partitioning. However, you can write your own partitioner to distribute the vertices. We are going to provide work stealing in the future so that load balancing gets better.
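To make the hash partitioning concrete, here is a minimal, self-contained sketch. It is not Hama's actual code: the `VertexPartitioner` interface, the `HashPartitionSketch` class, and the `assign` helper are hypothetical names made up for illustration, and the task count of 9 simply assumes the 3 groom servers with 3 task slots each from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HashPartitionSketch {

    /** Hypothetical interface for illustration only; not Hama's real API. */
    interface VertexPartitioner {
        int getPartition(String vertexId, int numTasks);
    }

    /** Plain hash partitioning: non-negative hash of the vertex ID modulo the task count. */
    static final VertexPartitioner HASH =
        (vertexId, numTasks) -> Math.floorMod(vertexId.hashCode(), numTasks);

    /** Groups vertex IDs by the task index that the given partitioner assigns to them. */
    static Map<Integer, List<String>> assign(
            List<String> vertexIds, int numTasks, VertexPartitioner partitioner) {
        Map<Integer, List<String>> byTask = new TreeMap<>();
        for (String id : vertexIds) {
            int task = partitioner.getPartition(id, numTasks);
            byTask.computeIfAbsent(task, k -> new ArrayList<>()).add(id);
        }
        return byTask;
    }

    public static void main(String[] args) {
        // The six source vertices from the PageRank input in the question.
        List<String> vertices = Arrays.asList(
            "stackoverflow.com", "facebook.com", "yahoo.com",
            "twitter.com", "nasa.gov", "youtube.com");
        int numTasks = 9; // assumed: 3 groom servers x 3 task slots

        assign(vertices, numTasks, HASH)
            .forEach((task, ids) -> System.out.println("task " + task + " -> " + ids));
    }
}
```

Nothing in this scheme looks at how many vertices a task already holds, so some tasks can end up with several vertices and others with none, which is why hashing alone gives no load-balancing guarantee. A custom policy would be another `VertexPartitioner` implementation passed to `assign`.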

2012/9/19 Yuesheng Hu <[email protected]>

> org.apache.hama.graph.GraphJobRunner is the most important class you should
> read, along with the other classes in org.apache.hama.graph.
>
> 2012/9/19 顾荣 <[email protected]>
>
> > Hi All, I have some questions about your design in HamaGraph. Let me take
> > the PageRank example to illustrate my questions.
> >
> > I have 3 Groom Servers, each with 3 free BSP task nodes, in my Hama
> > cluster. The input file is as below.
> >
> > "stackoverflow.com yahoo.com
> > facebook.com twitter.com google.com nasa.gov
> > yahoo.com nasa.gov stackoverflow.com
> > twitter.com google.com facebook.com
> > nasa.gov yahoo.com stackoverflow.com
> > youtube.com google.com yahoo.com
> > "
> > In this case, there are 6 vertices. How do you assign them among these
> > task nodes? Can it guarantee load balancing? And do you support a way for
> > users to customize their own vertex assignment policy? I am confused by
> > the task split part of Hama: from its source code it seems the same as
> > Hadoop (by input splits), but it works differently. And is the task split
> > part of HamaBSP the same as that of HamaGraph?
> >
> > Would you please give some info about that? If you are too busy to answer
> > my questions, please kindly point out in which classes or functions of
> > the source code you implemented what I am confused about, and I think I
> > can read more of it myself.
> >
> > Anyway, thanks again.
> >
> > Walker
