Question about ChainMapper and ChainReducer

Tarandeep Singh Tue, 25 Nov 2008 10:28:26 -0800

Hi,

I would like to know how ChainMapper and ChainReducer save IO.


The doc says the output of the first mapper becomes the input of the second,
and so on. Does this mean the output of the first map is *not* written to
HDFS, and that a second map process is started that operates only on the data
generated by the first map?

In other words, is it safe to assume that if map1 ran on node1 and produced
output D1, then D1 is stored locally on node1 and a second map process (from
the chained map job) operates only on this local D1?
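
To make the question concrete, here is a plain-Java sketch of the behaviour I
am assuming. It uses no Hadoop classes: RecordMapper and runChain are names I
made up for illustration, and I do not know whether ChainMapper actually works
this way. In the sketch, each stage's output is handed to the next stage in
memory on the same node, and nothing is written out between stages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ChainSketch {
    // Hypothetical stand-in for a mapper: one input record in, zero or more records out.
    interface RecordMapper extends Function<String, List<String>> {}

    // Passes the records through every mapper in order. Intermediate records
    // live only in memory; this models the assumption in my question, not
    // Hadoop's real implementation.
    static List<String> runChain(List<String> input, List<RecordMapper> chain) {
        List<String> current = input;
        for (RecordMapper mapper : chain) {
            List<String> next = new ArrayList<>();
            for (String record : current) {
                next.addAll(mapper.apply(record));
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        // map1: split each line into words; map2: upper-case each word.
        RecordMapper splitWords = line -> Arrays.asList(line.split(" "));
        RecordMapper upperCase = word -> Arrays.asList(word.toUpperCase());
        List<String> d2 = runChain(Arrays.asList("a b", "c"),
                                   Arrays.asList(splitWords, upperCase));
        System.out.println(d2);
    }
}
```

If this picture is right, D1 (the output of splitWords here) never leaves
node1, which is where I would expect the IO saving to come from.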

Thanks,
Taran
