Thanks a lot Amar. As usual, you have cleared a lot of the haze in my head :)
regards,

On Mon, Mar 3, 2008 at 9:32 PM, Amar Kamat <[EMAIL PROTECTED]> wrote:

> On Mon, 3 Mar 2008, Ahmad Humayun wrote:
>
> > Hello everyone,
> >
> > I have a question about the intermediate data output by the map
> > function. I wanted to know whether this intermediate data gets written
> > to the HDFS or stays in the node's local memory?
> It is stored on the local disk.
> > According to the MapReduce paper, the intermediate data is run through
> > a hash function which maps every key to a given reduce worker. So how
> > does this whole process happen? Does the map worker write the
> > intermediate data to the HDFS and then tell the JobTracker (Master)
> > which Reduce worker should be allotted this data? Or does the Map
> > worker keep the intermediate data in memory and make an RPC call
> > directly to the reduce worker (which was figured out by the hash
> > function) to transfer the intermediate data?
> >
> The map uses something called the partitioner. Each map writes each <k,v>
> pair to the output for the appropriate reducer, as determined by this
> partition function. In the end there is a map output file which is nothing
> but the outputs for each reduce function concatenated in sequence based on
> reduce id. The hash function you are talking about is the partition
> function in HADOOP. JobTracker is not involved in these things. Since the
> map has generated output for each reducer, whenever a reducer requests a
> map output the TaskTracker indexes into the map output file and sends the
> appropriate map output chunk.
> > It will be great if you can point me to the place where these
> > functionalities are implemented in hadoop.
> See TaskTracker$MapOutputServlet.
> > Plus it will be great if you can also point me to the place where the
> > hash function is in map?
> See o.a.h.m.Partitioner.java
> >
> > thanks again for the great support on this mailing list.
> >
> > regards,

--
Ahmad Humayun
Research Assistant
Computer Science Dpt., LUMS
+92 321 4457315
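
For anyone reading this thread later, here is a rough sketch of the flow Amar describes: partition each key to a reducer, lay the per-reducer outputs back to back in one map output file, and serve a reducer only its own slice by indexing into that file. This is not Hadoop's code. MapOutputSketch, MapOutput, writeMapOutput and segmentFor are hypothetical names made up for illustration, and the in-memory byte array stands in for the file on local disk. The partition formula is modeled on the hash-and-modulo approach of Hadoop's default hash partitioner.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class MapOutputSketch {

    /** Picks a reducer for a key: non-negative hash modulo the reducer count. */
    public static int partitionFor(String key, int numReducers) {
        // Masking the sign bit keeps the result in [0, numReducers).
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    /** Hypothetical stand-in for one map task's output file plus its index. */
    public static final class MapOutput {
        public final byte[] data;    // segments concatenated in reduce-id order
        public final int[] offsets;  // reducer r owns bytes offsets[r] .. offsets[r + 1]

        MapOutput(byte[] data, int[] offsets) {
            this.data = data;
            this.offsets = offsets;
        }

        /** What a reducer's fetch would return: only that reducer's slice. */
        public String segmentFor(int reducer) {
            int start = offsets[reducer];
            int length = offsets[reducer + 1] - start;
            return new String(data, start, length, StandardCharsets.UTF_8);
        }
    }

    /** Buckets the pairs by partition, then concatenates the buckets in order. */
    public static MapOutput writeMapOutput(List<String[]> pairs, int numReducers) {
        ByteArrayOutputStream[] buckets = new ByteArrayOutputStream[numReducers];
        for (int r = 0; r < numReducers; r++) {
            buckets[r] = new ByteArrayOutputStream();
        }
        for (String[] kv : pairs) {
            byte[] record = (kv[0] + "\t" + kv[1] + "\n").getBytes(StandardCharsets.UTF_8);
            buckets[partitionFor(kv[0], numReducers)].write(record, 0, record.length);
        }

        ByteArrayOutputStream file = new ByteArrayOutputStream();
        int[] offsets = new int[numReducers + 1];
        for (int r = 0; r < numReducers; r++) {
            offsets[r] = file.size();               // where reducer r's segment starts
            byte[] segment = buckets[r].toByteArray();
            file.write(segment, 0, segment.length);
        }
        offsets[numReducers] = file.size();         // end marker for the last segment
        return new MapOutput(file.toByteArray(), offsets);
    }

    public static void main(String[] args) {
        List<String[]> pairs = new ArrayList<>();
        pairs.add(new String[] {"apple", "1"});
        pairs.add(new String[] {"banana", "1"});
        pairs.add(new String[] {"apple", "1"});

        int numReducers = 3;
        MapOutput out = writeMapOutput(pairs, numReducers);
        for (int r = 0; r < numReducers; r++) {
            System.out.println("reducer " + r + " would fetch:");
            System.out.print(out.segmentFor(r));
        }
    }
}
```

The point of the offsets array is that the map side never needs to know which reducer will ask first. A request only has to carry the reduce id, and the serving side can seek straight to that segment without involving the JobTracker.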
