Re: Comparing BSP and MR

2011-12-10 Thread Avery Ching
You can certainly implement BSP on top of a MapReduce implementation. But this is going to be very, very expensive. Consider that all communication in MapReduce will go through the phase of storing map outputs locally (disk) before being sent to the reducer. Also, consider that the entire gra

Re: Comparing BSP and MR

2011-12-10 Thread Praveen Sripati
Avery, > Communication between mappers is not part of the MapReduce computing model. Therefore, it doesn't make sense for them to include it as it would unnecessarily complicate the fault-tolerance recovery. I agree that it doesn't make sense to complicate things by introducing communication bet

Re: Comparing BSP and MR

2011-12-10 Thread Jakob Homan
> There must definitely have been some thought around this. Yes, there was some thought put into the design before writing - in Hadoop's case - hundreds of thousands of lines of code, and, for Giraph, thousands so far. I invite you to read the MapReduce paper (http://research.google.com/archive/m

Re: Comparing BSP and MR

2011-12-10 Thread Avery Ching
On 12/9/11 10:22 PM, Praveen Sripati wrote: Jack, > Giraph maps do communicate: via RPC. This is done repeatedly in every mapper, during the compute phase. This is something that is not normal to MapReduce, it is special to Giraph. There must definitely have been some thought around this.

Re: Comparing BSP and MR

2011-12-09 Thread Praveen Sripati
Jack, > Giraph maps do communicate: via RPC. This is done repeatedly in every mapper, during the compute phase. This is something that is not normal to MapReduce, it is special to Giraph. There must definitely have been some thought around this. But, we can also have a mapper correspond to just

Re: Comparing BSP and MR

2011-12-09 Thread Jake Mannix
On Fri, Dec 9, 2011 at 8:16 PM, Praveen Sripati wrote: > Jake, > > > > Let's not crosspost, please, it makes the thread of conversation totally > opaque as to who is talking about what. > > Agree. I got it after the OP. > > > > There is only one set of map tasks for the Giraph job - those > long-ru

Re: Comparing BSP and MR

2011-12-09 Thread Praveen Sripati
Jake, > Let's not crosspost, please, it makes the thread of conversation totally opaque as to who is talking about what. Agree. I got it after the OP. > There is only one set of map tasks for the Giraph job - those long-running map tasks run possibly many supersteps. OK. But, map tasks don't com

Re: Comparing BSP and MR

2011-12-09 Thread Jake Mannix
[hama-user to bcc:] Let's not crosspost, please, it makes the thread of conversation totally opaque as to who is talking about what. On Fri, Dec 9, 2011 at 1:42 AM, Praveen Sripati wrote: > Thanks to Thomas and Avery for the response. > > > For Giraph you are quite correct, all the stuff is submi

Re: Comparing BSP and MR

2011-12-09 Thread Praveen Sripati
Thanks to Thomas and Avery for the response. > For Giraph you are quite correct, all the stuff is submitted as a MR job. But a full map stage is not a superstep, the whole computation is done in one mapping phase. So a map task in MR corresponds to a computation phase in a superstep. Once the c

Re: Comparing BSP and MR

2011-12-08 Thread Avery Ching
Hi Praveen, Answers inline. Hope that helps! Avery On 12/8/11 10:16 PM, Praveen Sripati wrote: Hi, I know about MapReduce/Hadoop and am trying to get my head around BSP/Hama-Giraph by comparing MR and BSP. - Map Phase in MR is similar to Computation Phase in BSP. BSP allows for process to ex
