Hi, I would like to know how ChainMapper and ChainReducer save I/O.
The documentation says the output of the first mapper becomes the input of the second, and so on. Does this mean that the output of the first map is *not* written to HDFS, and that a second map process is started that operates only on the data generated by the first map? In other words, is it safe to assume that if map1 ran on node1 and produced output D1, then D1 is stored locally on node1 and a second map process (from the chained map job) operates only on this local D1?

Thanks,
Taran