Hey Pavan,

Yes, you've got it almost right on how file.out is served to each reducer, and file.out.index does play a role: it is what lets the serving side locate a given partition inside file.out. See the code at http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java?view=markup (the method at L502:L565, which sends the data for a specific reduce/partition ID, an integer).
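
To make the idea concrete, here is a rough sketch of the lookup, not the actual Hadoop code. It assumes the index holds one fixed-size record per partition made of three longs (start offset, raw length, on-disk part length), which is how I understand the layout; please verify against the source. The class and method names are made up for illustration, and byte arrays stand in for file.out and file.out.index.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Simplified, in-memory model of a map output file and its index.
// Not Hadoop code: the class and method names here are hypothetical.
class PartitionIndexSketch {
    // Assumed layout: one record per partition, three 8-byte longs each
    // (start offset, raw length, on-disk part length).
    static final int RECORD_LENGTH = 3 * Long.BYTES;

    // Concatenate the partitions into one "file.out"-style byte array.
    static byte[] buildData(byte[][] partitions) {
        int total = 0;
        for (byte[] p : partitions) {
            total += p.length;
        }
        ByteBuffer out = ByteBuffer.allocate(total);
        for (byte[] p : partitions) {
            out.put(p);
        }
        return out.array();
    }

    // Build the matching "file.out.index"-style byte array.
    static byte[] buildIndex(byte[][] partitions) {
        ByteBuffer idx = ByteBuffer.allocate(partitions.length * RECORD_LENGTH);
        long offset = 0;
        for (byte[] p : partitions) {
            idx.putLong(offset);    // start offset of this partition
            idx.putLong(p.length);  // raw length (no compression in this sketch)
            idx.putLong(p.length);  // part length as stored
            offset += p.length;
        }
        return idx.array();
    }

    // Serving side for one reduce ID: read record number reduceId from the
    // index, then return only that byte range of the data.
    static byte[] fetchPartition(byte[] data, byte[] index, int reduceId) {
        ByteBuffer idx = ByteBuffer.wrap(index);
        int base = reduceId * RECORD_LENGTH;
        long start = idx.getLong(base);
        long partLength = idx.getLong(base + 2 * Long.BYTES);
        return Arrays.copyOfRange(data, (int) start, (int) (start + partLength));
    }
}
```

So the reducer never has to know any offsets itself. It only asks for its own reduce ID, and the map side turns that ID into an (offset, length) pair via the index and streams that slice of file.out back.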

On Sun, Aug 19, 2012 at 9:05 AM, Pavan Kulkarni <[email protected]> wrote:
> Hi,
>
> I was trying to understand how exactly the reducers find out how to fetch
> the data of its own partition from Map nodes.
> During the executions of MapReduce, I see that *file.out* is created on Map
> nodes, so my question is how does a reducer
> know what part of file.out to fetch? Is the *file.out.index* play any role?
> Any help is appreciated .Thanks
>
> --With Regards
> Pavan Kulkarni

--
Harsh J
