To be more precise, the partitioning is done at runtime in every task. Each task reads its block from HDFS and then starts partitioning.
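As a rough illustration of what "the usual hash partitioning" means, here is a minimal sketch. It is not the actual Hama code: the class HashPartitionSketch and the methods partitionFor and assign are hypothetical names made up for this example, and the sketch only assumes that a vertex is mapped to a task by hashing its ID modulo the number of tasks. In a real job each task would presumably apply the same rule to the records of its own block and hand each vertex to the task that owns it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; none of these names come from Hama.
public class HashPartitionSketch {

  // Hypothetical helper: map a vertex ID to one of numTasks partitions.
  public static int partitionFor(String vertexId, int numTasks) {
    // floorMod keeps the result in [0, numTasks) even for negative hash codes.
    return Math.floorMod(vertexId.hashCode(), numTasks);
  }

  // Group vertex IDs by the task that would own them under hash partitioning.
  public static List<List<String>> assign(List<String> vertexIds, int numTasks) {
    List<List<String>> buckets = new ArrayList<>();
    for (int i = 0; i < numTasks; i++) {
      buckets.add(new ArrayList<>());
    }
    for (String id : vertexIds) {
      buckets.get(partitionFor(id, numTasks)).add(id);
    }
    return buckets;
  }

  public static void main(String[] args) {
    // The six vertices from the PageRank example in this thread.
    List<String> ids = Arrays.asList(
        "stackoverflow.com", "facebook.com", "yahoo.com",
        "twitter.com", "nasa.gov", "youtube.com");
    List<List<String>> buckets = assign(ids, 3);
    for (int t = 0; t < buckets.size(); t++) {
      System.out.println("task " + t + " -> " + buckets.get(t));
    }
  }
}
```

Note that nothing here balances the load: the bucket sizes depend only on how the IDs happen to hash, which is why a custom partitioner can help for skewed inputs.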

2012/9/19 Thomas Jungblut

> As Yuesheng Hu already mentioned: in the GraphJobRunner method
> loadVertices, during the setup stage.
>
> 2012/9/19 顾荣
>
>> Sorry, I sent the last mail by mistake; it was unfinished.
>>
>> Hi Thomas,
>>
>> I just read this part of the code in the submitJobInternal() function of
>> org.apache.hama.bsp.BSPJobClient. As you mentioned, raw BSPs have the
>> opportunity to partition before the job:
>>
>>   // Create the splits for the job
>>   LOG.debug("Creating splits at " + fs.makeQualified(submitSplitFile));
>>   if (job.getConf().get("bsp.input.partitioner.class") != null
>>       && !job.getConf()
>>           .getBoolean("hama.graph.runtime.partitioning", false)) {
>>     job = partition(job, maxTasks);
>>     maxTasks = job.getInt("hama.partition.count", maxTasks);
>>   }
>>
>> By the way, if I do not partition the file at the submitting stage, when
>> and where will the vertices in the file be partitioned and assigned to
>> each task in Hama Graph? On the master node before running? Or on each
>> groom server at the first superstep?
>>
>> Thanks
>>
>> Walker.
>>
>> 2012/9/19 顾荣
>>
>> > Hi Thomas,
>> >
>> > I just read this part of the code in the submitJobInternal() function
>> > of org.apache.hama.bsp.BSPJobClient. As you mentioned, raw BSPs have
>> > the opportunity to partition before the job:
>> >
>> >   // Create the splits for the job
>> >   LOG.debug("Creating splits at " + fs.makeQualified(submitSplitFile));
>> >   if (job.getConf().get("bsp.input.partitioner.class") != null
>> >       && !job.getConf()
>> >           .getBoolean("hama.graph.runtime.partitioning", false)) {
>> >     job = partition(job, maxTasks);
>> >     maxTasks = job.getInt("hama.partition.count", maxTasks);
>> >   }
>> >
>> > 2012/9/19 Thomas Jungblut
>> >
>> >> Hey,
>> >>
>> >> the file is split the way Hadoop does it, as defined by the input
>> >> format. It will be partitioned during runtime. Raw BSPs have the
>> >> opportunity to partition before the job, but this is not very
>> >> scalable, so we have not done this in graph algorithms. There is no
>> >> load balancing besides the usual hash partitioning. However, you can
>> >> write your own partitioner to distribute the vertices. We are going
>> >> to provide work stealing in the future so that the load balancing
>> >> gets better.
>> >>
>> >> 2012/9/19 Yuesheng Hu
>> >>
>> >> > org.apache.hama.graph.GraphJobRunner is the most important class
>> >> > you should read, along with the other classes in
>> >> > org.apache.hama.graph.
>> >> >
>> >> > 2012/9/19 顾荣
>> >> >
>> >> > > Hi all, I have some questions about your design in Hama Graph.
>> >> > > Let me take the PageRank example to illustrate my questions.
>> >> > >
>> >> > > I have 3 Groom Servers, each with 3 free BSP task nodes, in my
>> >> > > Hama cluster. The input file is as below.
>> >> > >
>> >> > > "stackoverflow.com yahoo.com
>> >> > > facebook.com twitter.com google.com nasa.gov
>> >> > > yahoo.com nasa.gov stackoverflow.com
>> >> > > twitter.com google.com facebook.com
>> >> > > nasa.gov yahoo.com stackoverflow.com
>> >> > > youtube.com google.com yahoo.com
>> >> > > "
>> >> > > In this case, there are 6 vertices. How do you assign them among
>> >> > > these task nodes? Can it guarantee load balancing? And do you
>> >> > > support a way for users to supply their own vertex assignment
>> >> > > policy? I am confused by the task split part of Hama: from the
>> >> > > source code it seems the same as Hadoop (by input splits), but it
>> >> > > works differently. And is the task split part of Hama BSP the
>> >> > > same as in Hama Graph?
>> >> > >
>> >> > > Would you please give some info about that? If you are too busy
>> >> > > to answer my questions, please kindly point out in which classes
>> >> > > or functions of the source code you implemented what I am
>> >> > > confused about; I think I can read it myself.
>> >> > >
>> >> > > Anyway, thanks again.
>> >> > >
>> >> > > Walker
