Re: Barrier between reduce and map of the next round

2010-02-09 Thread Felix Halim
Hi Arun, Ah yes.. the first comment by Owen O'Malley is exactly what I have in mind. Thanks, Felix Halim On Wed, Feb 10, 2010 at 3:04 AM, Arun C Murthy wrote: > Felix, you might want to follow > https://issues.apache.org/jira/browse/MAPREDUCE-1434. > We are discussing ideas very similar to wha

Re: Barrier between reduce and map of the next round

2010-02-09 Thread Arun C Murthy
Felix, you might want to follow https://issues.apache.org/jira/browse/MAPREDUCE-1434 . We are discussing ideas very similar to what you've just described over there. Arun On Feb 8, 2010, at 9:49 PM, Felix Halim wrote: Hi, Currently the barrier between r(i) and m(i+1) is the Job barrier. Th

Re: Barrier between reduce and map of the next round

2010-02-08 Thread Felix Halim
Hi, Currently the barrier between r(i) and m(i+1) is the Job barrier. That is, m(i+1) will be blocked until all r(i) finish (until Job i finishes). I'm saying this blocking is not necessary if we can concatenate them all in a single Job as an endless chain. Therefore m(i+1) can start immediately ev

Re: Barrier between reduce and map of the next round

2010-02-08 Thread Amogh Vasekar
Hi, >>m1 | r1 m2 | r2 m3 | ... | r(K-1) mK | rK m(K+1) My understanding is it would be something like: m1|(r1 m2)| m(identity) | r2, if you combine the r(i) and m(i+1), because of the hard distinction between Rs & Ms. Amogh On 2/4/10 1:46 PM, "Felix Halim" wrote: Talking about barrier, curren

Re: Barrier between reduce and map of the next round

2010-02-04 Thread Felix Halim
Talking about barrier, currently there are barriers between everything: m1 | r1 | m2 | r2 | ... | mK | rK where | is the barrier. I'm saying that the barrier between ri and m(i+1) is not necessary. So it should go like this: m1 | r1 m2 | r2 m3 | ... | r(K-1) mK | rK m(K+1) Here the result of m(K
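
To make the two schedules above concrete, here is a rough timing sketch in Python. It is not Hadoop code and nothing in it comes from the thread beyond the barrier layout: the function names, the per-task durations, the unlimited task slots, and the assumption that map task j of round i+1 reads only the output of reducer j of round i are all hypothetical simplifications.

```python
def makespan_job_barrier(rounds):
    """m1 | r1 | m2 | r2 | ...: every phase waits for the slowest task of the previous phase."""
    t = 0.0
    for map_durs, red_durs in rounds:
        t += max(map_durs)  # barrier: reducers wait for all map tasks
        t += max(red_durs)  # job barrier: next round waits for all reducers
    return t


def makespan_no_rm_barrier(rounds):
    """m1 | r1 m2 | r2 m3 | ...: only the map -> reduce barrier is kept.

    Assumption (hypothetical): map task j of a round consumes only the
    output of reducer j of the previous round, so it can start as soon
    as that single reducer finishes.
    """
    ready = [0.0] * len(rounds[0][0])  # when each map task's input is available
    end = 0.0
    for map_durs, red_durs in rounds:
        if len(map_durs) != len(ready):
            raise ValueError("need one map task per reducer of the previous round")
        # Reducers still wait for every map task of their own round.
        maps_done = max(r + d for r, d in zip(ready, map_durs))
        ready = [maps_done + d for d in red_durs]
        end = max(ready)
    return end


# Two rounds, two tasks per phase: (map durations, reduce durations).
rounds = [([2, 2], [1, 5]), ([3, 1], [2, 2])]
with_barrier = makespan_job_barrier(rounds)
without_barrier = makespan_no_rm_barrier(rounds)
print(with_barrier, without_barrier)
```

In this toy input the slow reducer of round 1 feeds the fast map task of round 2, which is the case where dropping the r(i) to m(i+1) barrier helps; with perfectly balanced tasks the two schedules take the same time.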

Re: Barrier between reduce and map of the next round

2010-02-03 Thread Amogh Vasekar
>>However, from ri to m(i+1) there is an unnecessary barrier. m(i+1) should not >>need to wait for all reducers ri to finish, right? Yes, but r(i+1) can't be in the same job, since that requires another sort and shuffle phase (barrier). So you would end up doing job(i): m(i) r(i) m(i+1). Job

Re: Barrier between reduce and map of the next round

2010-02-03 Thread Felix Halim
Hi Ed, Currently my program is like this: m1,r1, m2,r2, ..., mK, rK. The barrier between mi and ri is acceptable since a reducer has to wait for all map tasks to finish. However, from ri to m(i+1) there is an unnecessary barrier. m(i+1) should not need to wait for all reducers ri to finish, right?

Re: Barrier between reduce and map of the next round

2010-02-03 Thread Ed Mazur
Felix, You can use ChainMapper and ChainReducer to create jobs of the form M+RM*. Is that what you're looking for? I'm not aware of anything that allows you to have multiple reduce functions without the job "barrier". Ed On Wed, Feb 3, 2010 at 9:41 PM, Felix Halim wrote: > Hi all, > > As far as
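
For readers unfamiliar with the M+RM* shape Ed mentions, the following is a minimal in-memory sketch in Python. It only illustrates the data flow (one or more mappers, a single shuffle and reduce, then zero or more mappers); it does not use the Hadoop ChainMapper/ChainReducer API, and `run_chain_job` and the example mappers are hypothetical names invented for this illustration.

```python
from collections import defaultdict


def run_chain_job(records, pre_mappers, reducer, post_mappers=()):
    """Run one job of the form M+ R M* over (key, value) records in memory."""
    if not pre_mappers:
        raise ValueError("M+ requires at least one mapper before the reducer")
    pairs = list(records)
    # M+: each mapper takes (key, value) and yields zero or more (key, value) pairs.
    for mapper in pre_mappers:
        pairs = [out for k, v in pairs for out in mapper(k, v)]
    # The single sort/shuffle of the job: group values by key.
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    # R: exactly one reduce function per job.
    reduced = [out for k in sorted(groups) for out in reducer(k, groups[k])]
    # M*: mappers chained after the reducer, with no further shuffle.
    for mapper in post_mappers:
        reduced = [out for k, v in reduced for out in mapper(k, v)]
    return reduced


def tokenize(_offset, line):
    for word in line.split():
        yield word, 1


def lowercase(word, count):
    yield word.lower(), count


def sum_counts(word, counts):
    yield word, sum(counts)


def keep_repeated(word, total):
    if total >= 2:
        yield word, total


result = run_chain_job(
    [(0, "a b A"), (1, "b c")],
    pre_mappers=[tokenize, lowercase],
    reducer=sum_counts,
    post_mappers=[keep_repeated],
)
print(result)
```

The sketch has room for only one reduce step per job, which mirrors Ed's point: a second reduce function would need another shuffle, and in the setup discussed here that means another job.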