Re: Barrier between reduce and map of the next round

2010-02-09 Thread Felix Halim
Hi Arun,

Ah yes.. the first comment by Owen O'Malley is exactly what I have in mind.

Thanks,
Felix Halim

On Wed, Feb 10, 2010 at 3:04 AM, Arun C Murthy wrote:
> Felix, you might want to follow
> https://issues.apache.org/jira/browse/MAPREDUCE-1434.
> We are discussing ideas very similar to wha

Re: Barrier between reduce and map of the next round

2010-02-09 Thread Arun C Murthy
Felix, you might want to follow https://issues.apache.org/jira/browse/MAPREDUCE-1434 . We are discussing ideas very similar to what you've just described over there.

Arun

On Feb 8, 2010, at 9:49 PM, Felix Halim wrote:
Hi, Currently the barrier between r(i) and m(i+1) is the Job barrier. Th

Re: Barrier between reduce and map of the next round

2010-02-08 Thread Felix Halim
Hi,

Currently the barrier between r(i) and m(i+1) is the Job barrier. That is, m(i+1) will be blocked until all r(i) finish (until Job i finishes). I'm saying this blocking is not necessary if we can concatenate them all in a single Job as an endless chain. Therefore m(i+1) can start immediately ev

Re: Barrier between reduce and map of the next round

2010-02-08 Thread Amogh Vasekar
Hi,

>>m1 | r1 m2 | r2 m3 | ... | r(K-1) mK | rK m(K+1)

My understanding is it would be something like: m1|(r1 m2)| m(identity) | r2, if you combine the r(i) and m(i+1), because of the hard distinction between Rs & Ms.

Amogh

On 2/4/10 1:46 PM, "Felix Halim" wrote:
Talking about barrier, curren

Re: Barrier between reduce and map of the next round

2010-02-04 Thread Felix Halim
Talking about barriers, currently there is a barrier between every pair of consecutive phases:

m1 | r1 | m2 | r2 | ... | mK | rK

where | is the barrier. I'm saying that the barrier between r(i) and m(i+1) is not necessary. So it should go like this:

m1 | r1 m2 | r2 m3 | ... | r(K-1) mK | rK m(K+1)

Here the result of m(K
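
The effect of dropping the barrier between r(i) and m(i+1) can be illustrated with a toy timing model. This is only a sketch, not Hadoop code: it assumes one m(i+1) task per r(i) task, no limit on task slots, and made-up finish times and durations. The function names are hypothetical.

```python
def next_map_finish_with_job_barrier(reduce_finish, map_duration):
    """Every m(i+1) task waits until the slowest r(i) task has finished."""
    barrier = max(reduce_finish)
    return [barrier + d for d in map_duration]


def next_map_finish_without_barrier(reduce_finish, map_duration):
    """Each m(i+1) task starts as soon as the r(i) task feeding it finishes."""
    return [r + d for r, d in zip(reduce_finish, map_duration)]


# Hypothetical numbers: finish times of three r(i) tasks and the
# durations of the three m(i+1) tasks that consume their output.
reduce_finish = [4, 9, 5]
map_duration = [3, 1, 3]

with_barrier = next_map_finish_with_job_barrier(reduce_finish, map_duration)
without_barrier = next_map_finish_without_barrier(reduce_finish, map_duration)

# r(i+1) still has to wait for all m(i+1) tasks (the map/reduce barrier stays),
# so the interesting number is the latest map finish time in each case.
print(max(with_barrier), max(without_barrier))
```

Under these assumptions, r(i+1) can begin as soon as the last m(i+1) task finishes, and that moment comes earlier when the maps are not held back by the slowest reducer of the previous round.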

Re: Barrier between reduce and map of the next round

2010-02-03 Thread Amogh Vasekar
>>However, from ri to m(i+1) there is an unnecessary barrier. m(i+1) should not
>>need to wait for all reducers ri to finish, right?

Yes, but r(i+1) can't be in the same job, since that requires another sort and shuffle phase (barrier). So you would end up doing job(i): m(i)r(i)m(i+1). Job

Re: Barrier between reduce and map of the next round

2010-02-03 Thread Felix Halim
Hi Ed,

Currently my program is like this: m1, r1, m2, r2, ..., mK, rK. The barrier between mi and ri is acceptable since the reducer has to wait for all map tasks to finish. However, from ri to m(i+1) there is an unnecessary barrier. m(i+1) should not need to wait for all reducers ri to finish, right?

Re: Barrier between reduce and map of the next round

2010-02-03 Thread Ed Mazur
Felix,

You can use ChainMapper and ChainReducer to create jobs of the form M+RM*. Is that what you're looking for? I'm not aware of anything that allows you to have multiple reduce functions without the job "barrier".

Ed

On Wed, Feb 3, 2010 at 9:41 PM, Felix Halim wrote:
> Hi all,
>
> As far as
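
The M+RM* shape mentioned above (one or more mappers, exactly one reducer, then zero or more mappers, all inside a single job) can be sketched in plain Python. This is not the Hadoop ChainMapper/ChainReducer API; `run_chain_job` and the example mappers are hypothetical names used only to show the shape of the pipeline.

```python
from collections import defaultdict


def apply_mappers(pairs, mappers):
    """Run (key, value) pairs through each mapper in order."""
    for mapper in mappers:
        pairs = [out for k, v in pairs for out in mapper(k, v)]
    return pairs


def run_chain_job(pairs, pre_mappers, reducer, post_mappers):
    """One job of the form M+ R M*: mappers, a single reduce, more mappers."""
    if not pre_mappers:
        raise ValueError("M+ requires at least one mapper before the reducer")
    mapped = apply_mappers(pairs, pre_mappers)
    # Shuffle and sort: the only barrier inside the job.
    groups = defaultdict(list)
    for k, v in mapped:
        groups[k].append(v)
    reduced = [out for k in sorted(groups) for out in reducer(k, groups[k])]
    # Mappers after the reducer run on the reducer output with no new shuffle.
    return apply_mappers(reduced, post_mappers)


def tokenize(_, line):
    for word in line.split():
        yield word, 1


def lowercase(word, count):
    yield word.lower(), count


def sum_counts(word, counts):
    yield word, sum(counts)


def keep_frequent(word, total):
    if total >= 2:
        yield word, total


result = run_chain_job(
    [(0, "a B a"), (1, "b c")],
    pre_mappers=[tokenize, lowercase],
    reducer=sum_counts,
    post_mappers=[keep_frequent],
)
print(result)
```

The sketch has a single grouping step, which is why a second reduce function does not fit into the same job: it would need another shuffle, and that is the job "barrier" discussed in this thread.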

Barrier between reduce and map of the next round

2010-02-03 Thread Felix Halim
Hi all,

As far as I know, a barrier exists between the map and reduce functions in one round of MR. There is another barrier for the reducer to end the job for that round. However, if we want to run in several rounds using the same map and reduce functions, then the barrier between reduce and the map of