Not quite; the intermediate output is written to the local disk of the
node executing the MapTask (not to HDFS) and fetched over HTTP by the
ReduceTask. A ReduceTask need only wait for an individual MapTask to
complete successfully before fetching that task's output, but the
reduce phase itself cannot start until all MapTasks have finished. The
intermediate output is sorted, so the ReduceTask only needs to merge
the map outputs and group records by key (using the grouping
comparator). -C
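[Editor's note: the merge-and-group step described above can be sketched in Python; this is illustrative only (Hadoop's shuffle is implemented in Java), and the sample data and function names are hypothetical. Each map task produces a run sorted by key; the reduce side does a k-way merge of those runs and groups consecutive records by key before invoking reduce().]

```python
import heapq
from itertools import groupby
from operator import itemgetter

# Hypothetical sorted intermediate outputs, one run per completed MapTask.
map_outputs = [
    [("apple", 1), ("banana", 2), ("cherry", 1)],
    [("apple", 3), ("banana", 1), ("date", 5)],
]

def reduce_phase(sorted_runs):
    """K-way merge of the already-sorted map outputs, then group
    consecutive records by key -- the same shape of work a ReduceTask
    performs after fetching map output."""
    merged = heapq.merge(*sorted_runs, key=itemgetter(0))
    for key, records in groupby(merged, key=itemgetter(0)):
        # A real job would call the user's reduce(key, values) here;
        # this sketch just sums the values.
        yield key, sum(v for _, v in records)

print(dict(reduce_phase(map_outputs)))
# -> {'apple': 4, 'banana': 3, 'cherry': 1, 'date': 5}
```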
On Jul 14, 2008, at 3:59 PM, Mori Bellamy wrote:
I'm pretty sure that the reducer waits for all of the map tasks'
output to be written to HDFS (or else I see no use for the Combiner
class). I'm not sure about your second question, though; my gut
tells me "no".
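[Editor's note: the Combiner mentioned above runs on the map side to pre-aggregate records before they are shuffled, cutting down the intermediate data each reducer must fetch. A toy word-count-style sketch follows; the function and sample data are illustrative, not Hadoop API.]

```python
from collections import defaultdict

def combine(map_output):
    """Map-side pre-aggregation: collapse repeated keys locally so less
    intermediate data is written to disk and shuffled to the reducers."""
    combined = defaultdict(int)
    for key, value in map_output:
        combined[key] += value
    return sorted(combined.items())

# One mapper's raw output before combining.
raw = [("the", 1), ("cat", 1), ("the", 1), ("sat", 1), ("the", 1)]
print(combine(raw))
# -> [('cat', 1), ('sat', 1), ('the', 3)]
```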
On Jul 14, 2008, at 3:50 PM, Kevin wrote:
Hi, there,
I am interested in the implementation details of hadoop mapred. In
particular, does the reducer wait until a map task ends and then fetch
its output (key-value pairs)? If so, is the file produced by a
mapper for the reducer sorted before the reducer gets it? (That would
mean the reducer only needs to do a merge sort once it has all the
intermediate files from the different mappers.)
Best,
-Kevin