Thanks Harsh for the response. It very much answers what I was looking for.
Regards,
Rahul

On Wed, May 29, 2013 at 8:10 PM, Rahul Bhattacharjee <[email protected]> wrote:

> Hi,
>
> I have one question related to the reduce phase of MR jobs.
>
> The intermediate outputs of map tasks are pulled from the nodes that
> ran the map tasks to the node where the reducer is going to run, and that
> intermediate data is written to the reducer's local fs. My question is:
> if a job processes a huge amount of data and has multiple mappers but
> only one reducer, then it's possible that the job would never complete
> successfully, as the single host's disk might not be sufficient to
> hold all the map outputs of the job.
>
> The job would essentially fail after retrying the configured number of
> attempts.
>
> Thanks,
> Rahul
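
For anyone reading this thread later, the disk-space concern in the quoted question can be put as rough arithmetic. The sketch below is only an illustration and not Hadoop code: the function names and all the sizes are made up, and it assumes map output is spread evenly across reducers, ignoring compression, merge overhead, and key skew.

```python
def fits_on_single_reducer(map_output_bytes, reducer_disk_bytes):
    """True if the combined map output fits on one reducer's local disk."""
    return sum(map_output_bytes) <= reducer_disk_bytes


def reducers_needed(map_output_bytes, reducer_disk_bytes):
    """Smallest reducer count whose even share of the map output fits on
    one reducer's local disk (hypothetical helper, even partitioning assumed)."""
    if reducer_disk_bytes <= 0:
        raise ValueError("reducer_disk_bytes must be positive")
    total = sum(map_output_bytes)
    # Integer ceiling division; a job always needs at least one reducer.
    return max(1, (total + reducer_disk_bytes - 1) // reducer_disk_bytes)


GIB = 1024 ** 3
map_outputs = [50 * GIB] * 40   # 40 map tasks at 50 GiB each (made-up numbers)
disk = 500 * GIB                # free local disk on one reducer host (made up)

print(fits_on_single_reducer(map_outputs, disk))
print(reducers_needed(map_outputs, disk))
```

With these numbers the total map output is 2000 GiB, so a single reducer with 500 GiB of local disk cannot hold it, and at least four evenly loaded reducers would be needed.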
