In the case of Hadoop, the reducers can start while maps are still running,
because the shuffle phase can begin pulling map outputs as soon as they are
available. This overlaps the map and shuffle phases. The actual reduce
happens only after all the maps have completed and the map output meant for
that reduce has been sorted. So even in Hadoop the reduce function is
applied only after all the maps finish; the reducers just start early, in
parallel, for shuffling.
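For what it's worth, later Hadoop releases expose this behaviour as a
tunable: the fraction of maps that must complete before reducers are even
launched to begin shuffling. A sketch of the relevant mapred-site.xml
entry, assuming the classic property name (newer releases renamed it to
mapreduce.job.reduce.slowstart.completedmaps, so check your version):

```xml
<!-- mapred-site.xml fragment (property name assumed for 0.x-era Hadoop).
     Launch reducers, i.e. start the shuffle, once 5% of maps have
     completed; raise the value toward 1.0 to hold reducers back until
     nearly all maps are done. -->
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>0.05</value>
</property>
```

Note this only controls when shuffling starts; the reduce function itself
still runs only after every map has finished, as described above.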
Amar
On Mon, 3 Mar 2008, momina khan wrote:
hi all,
as seen in the video lectures from Google, their MapReduce ensures
that all maps finish before reduces begin ... their reason for
ensuring this is that not all reduce functions are necessarily
idempotent....
i just wanted to confirm whether hadoop too follows the same
philosophy? do all maps end before reduces begin, or can they go on
in parallel? because that is the impression you get from the hadoop code!
cheers
momina