Moving this MapReduce-specific question to [email protected]

All map task related execution starts at org.apache.hadoop.mapred.MapTask.

For your specific question, see MapTask.runNewMapper() -> NewOutputCollector -> MapOutputBuffer. MapOutputBuffer is where the map output is buffered and spilled to disk.
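
If it helps while reading the code, the "buffer until a threshold, then spill in the background" idea can be pictured with a toy sketch. This is not the Hadoop implementation: the class name SpillingBufferSketch, the record-count threshold, and the in-memory lists standing in for spill files are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical toy class, NOT part of Hadoop: shows threshold-triggered spilling only.
class SpillingBufferSketch {
    private final int spillThreshold;               // assumed: counted in records, not bytes
    private List<String> buffer = new ArrayList<>();
    // Each inner list stands in for one sorted spill file on disk.
    private final List<List<String>> spills =
            Collections.synchronizedList(new ArrayList<>());
    // A single background thread, so spills complete in the order they were started.
    private final ExecutorService spillThread = Executors.newSingleThreadExecutor();

    SpillingBufferSketch(int spillThreshold) {
        this.spillThreshold = spillThreshold;
    }

    // Called once per map output record.
    void collect(String record) {
        buffer.add(record);
        if (buffer.size() >= spillThreshold) {
            startSpill();
        }
    }

    // Hand the full buffer to the background thread and keep collecting into a new one.
    private void startSpill() {
        final List<String> toSpill = buffer;
        buffer = new ArrayList<>();
        spillThread.execute(() -> {
            Collections.sort(toSpill);  // sort before "writing"
            spills.add(toSpill);        // stand-in for writing a spill file
        });
    }

    // Spill whatever is left, wait for the background thread, and return all spills.
    List<List<String>> close() {
        if (!buffer.isEmpty()) {
            startSpill();
        }
        spillThread.shutdown();
        try {
            spillThread.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(spills);
    }

    public static void main(String[] args) {
        SpillingBufferSketch out = new SpillingBufferSketch(3);
        for (String key : new String[] {"d", "b", "a", "e", "c"}) {
            out.collect(key);
        }
        System.out.println(out.close());  // prints [[a, b, d], [c, e]]
    }
}
```

With a threshold of 3, the first three records go out as one sorted spill and the remaining two are spilled on close. The sketch swaps to a fresh list on each spill purely to stay short; treat it as a mental model rather than a description of how MapOutputBuffer manages its memory.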

HTH,
+vinod


On Tuesday 17 August 2010 04:17 PM, Rahul.V. wrote:
Hi,
I've read that the intermediate map output is written to disk at
regular intervals. In fact, I've read that there are background threads which
spill the data to disk whenever it crosses a threshold. [Source: Hadoop:
The Definitive Guide.]
I've tried digging into the code a couple of times to see exactly where this
happens. If any of you know where it is, could you kindly let me know the
filename and package name where I can find it?

