Hi,
I am running a MapReduce program which reads data from a file,
processes it, and writes the output to another file.
I run 4 maps and 4 reduces, and my output is as follows:
09/08/27 17:34:37 INFO mapred.JobClient: Running job: job_200908271142_0026
09/08/27 17:34:38 INFO
But the reducers can do some preparation while the maps are still
running. They can distribute the map output across the nodes that will
work as reducers.
Copying and sorting the map output is also a time-consuming process
(maybe more time-consuming than the reduce itself). For example, a piece
of a job run log on a 40-node cluster
could be