Oh, thanks a lot, Harsh. That is exactly what I was looking for. I wanted to create a different file.out for each reducer, something like file.out.1 for reducer 1, file.out.2 for reducer 2, and so on. Is it possible to do this in the MapReduce program, or do I need to tweak some Hadoop source files for that? Thanks.

On Sun, Aug 19, 2012 at 7:02 AM, Harsh J <[email protected]> wrote:
> Hey Pavan,
>
> Yes you've got it almost right on how file.out is served to each
> reducer. See the code at
> http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java?view=markup
> (Method under L502:L565 that sends data for a specific
> reduce/partition ID (integer)).
>
> On Sun, Aug 19, 2012 at 9:05 AM, Pavan Kulkarni <[email protected]> wrote:
> > Hi,
> >
> > I was trying to understand how exactly the reducers find out how to fetch
> > the data of its own partition from Map nodes.
> > During the executions of MapReduce, I see that *file.out* is created on Map
> > nodes, so my question is how does a reducer
> > know what part of file.out to fetch? Is the *file.out.index* play any role?
> > Any help is appreciated. Thanks
> >
> > --With Regards
> > Pavan Kulkarni
>
> --
> Harsh J

--
--With Regards
Pavan Kulkarni
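
For reference, the lookup discussed in this thread (one file.out per map task, with file.out.index telling each reducer which slice is its own) can be sketched roughly as below. This is a simplified model and not Hadoop's actual code: the class PartitionIndexSketch, the IndexEntry type, and the helper names are all hypothetical, and the real index file is assumed to carry more per-partition detail than the offset and length shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, simplified model of file.out plus file.out.index.
public class PartitionIndexSketch {

    // One index entry per reduce partition: where its bytes sit in file.out.
    static final class IndexEntry {
        final long startOffset;
        final long length;

        IndexEntry(long startOffset, long length) {
            this.startOffset = startOffset;
            this.length = length;
        }
    }

    // Concatenate all partition segments into a single "file.out" byte array.
    static byte[] concat(byte[][] partitions) {
        int total = 0;
        for (byte[] p : partitions) {
            total += p.length;
        }
        byte[] fileOut = new byte[total];
        int pos = 0;
        for (byte[] p : partitions) {
            System.arraycopy(p, 0, fileOut, pos, p.length);
            pos += p.length;
        }
        return fileOut;
    }

    // Build the "file.out.index": a running offset and a length per partition.
    static IndexEntry[] buildIndex(byte[][] partitions) {
        IndexEntry[] index = new IndexEntry[partitions.length];
        long offset = 0;
        for (int i = 0; i < partitions.length; i++) {
            index[i] = new IndexEntry(offset, partitions[i].length);
            offset += partitions[i].length;
        }
        return index;
    }

    // What the serving side does for a given reduce/partition ID:
    // look up the entry, then return only that byte range of file.out.
    static byte[] fetchPartition(byte[] fileOut, IndexEntry[] index, int reduceId) {
        IndexEntry entry = index[reduceId];
        int from = (int) entry.startOffset;
        return Arrays.copyOfRange(fileOut, from, from + (int) entry.length);
    }

    public static void main(String[] args) {
        byte[][] partitions = {
            "a=1;".getBytes(StandardCharsets.UTF_8),     // partition 0
            "b=2;c=3;".getBytes(StandardCharsets.UTF_8), // partition 1
            "d=4;".getBytes(StandardCharsets.UTF_8)      // partition 2
        };
        byte[] fileOut = concat(partitions);
        IndexEntry[] index = buildIndex(partitions);

        // Reducer 1 asks for its partition; prints b=2;c=3;
        byte[] slice = fetchPartition(fileOut, index, 1);
        System.out.println(new String(slice, StandardCharsets.UTF_8));
    }
}
```

In this model every reducer reads from the same file.out, and only the index entry differs per reduce/partition ID, which matches the description above of a method that sends data for a specific partition ID.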
