Hi all,

As far as I know, a barrier exists between the map and reduce phases in
one round of MR, and there is another barrier at the end of the reduce
phase that ends the job for that round. However, if we want to run
several rounds using the same map and reduce functions, then the barrier
between the reduce of one round and the map of the next round is NOT
necessary, right? Since each reducer only outputs a single value per
key, a reducer could start running a map task for the next round
immediately rather than waiting for all the other reducers to finish.
This way, the utilization of the machines between rounds could be
improved.
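
To make it concrete, below is roughly the kind of driver I have in mind:
the same Mapper/Reducer classes are submitted as a chain of jobs, each
round reading the previous round's output. The class names, paths, and
map/reduce bodies are just placeholders, not real code from my job.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IterativeDriver {

  // Stand-ins for the real map/reduce functions reused every round.
  public static class MyMapper extends Mapper<LongWritable, Text, Text, Text> {
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      ctx.write(value, value);                  // placeholder map logic
    }
  }

  public static class MyReducer extends Reducer<Text, Text, Text, Text> {
    protected void reduce(Text key, Iterable<Text> values, Context ctx)
        throws IOException, InterruptedException {
      ctx.write(key, values.iterator().next()); // emits a single value per key
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String input = "/data/round0";              // placeholder paths
    for (int round = 1; round <= 10; round++) {
      String output = "/data/round" + round;
      Job job = new Job(conf, "round-" + round);
      job.setJarByClass(IterativeDriver.class);
      job.setMapperClass(MyMapper.class);       // same map function every round
      job.setReducerClass(MyReducer.class);     // same reduce function every round
      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(Text.class);
      FileInputFormat.addInputPath(job, new Path(input));
      FileOutputFormat.setOutputPath(job, new Path(output));
      // This wait is the barrier I am asking about: the maps of round N+1
      // cannot start until every reducer of round N has finished.
      job.waitForCompletion(true);
      input = output;                           // this round's output feeds the next
    }
  }
}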

Is there a setting in Hadoop to do that?

Felix Halim
