Re: Availability of values in a key in Reduce stage

Amar Kamat Sat, 27 Feb 2010 08:57:41 -0800

Parth,
The reducer process has two distinct steps:

 1.  Shuffle
 2.  Reduce


In the shuffle phase, the reducer 'r' does the following:

 1.  copies the data generated by all the mappers for the reducer 'r'
 2.  sorts it

After the shuffle phase, the reduce phase starts. In this phase the reducer
invokes the reduce() function once for each [k, <v1, v2, ...>] group produced by
the shuffle phase, so all values for a key are available together.
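
The two steps above can be sketched as a small simulation. This is plain Python, not Hadoop's API: partition_of, shuffle, run_reducer and count_reduce are hypothetical names made up for illustration, and the modulo partitioner is only an assumption standing in for whatever partitioner a real job uses.

```python
from itertools import groupby


def partition_of(key, num_reducers):
    # Hypothetical stand-in for a partitioner: deterministic across runs.
    return sum(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, r, num_reducers):
    # Step 1: copy the pairs meant for reducer r from every mapper's output.
    copied = [
        (key, value)
        for output in map_outputs  # one list of (key, value) pairs per mapper
        for key, value in output
        if partition_of(key, num_reducers) == r
    ]
    # Step 2: sort by key so equal keys end up next to each other.
    copied.sort(key=lambda kv: kv[0])
    return copied


def run_reducer(map_outputs, r, num_reducers, reduce_fn):
    # The reduce phase begins only after the shuffle has finished.
    sorted_pairs = shuffle(map_outputs, r, num_reducers)
    results = []
    for key, group in groupby(sorted_pairs, key=lambda kv: kv[0]):
        values = [value for _, value in group]
        # reduce_fn is called once per key, with all of that key's values.
        results.append(reduce_fn(key, values))
    return results


def count_reduce(key, values):
    # Example reduce function: sum the values for a key.
    return (key, sum(values))


# Output of three mappers (made-up sample data).
map_outputs = [
    [("apple", 1), ("pear", 1)],
    [("apple", 1)],
    [("pear", 1), ("apple", 1)],
]

print(run_reducer(map_outputs, 0, 1, count_reduce))
```

With a single reducer, each key reaches count_reduce exactly once, together with the values emitted by all three mappers.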

Amar


On 2/27/10 4:36 PM, "parth" <[email protected]> wrote:



Hi,

I am confused about a particular point regarding the reducer. Can anyone guide
me on this?

When the mappers start generating key-value pairs, will they all be available in
the reducer, i.e. after all mappers have exited? I mean, for a key K, will all
values be grouped and available in the reducer? Or will the reducer run on a
single key-value pair as it becomes available?
The second option seems highly unrealistic.

Thanks,
Parth
--
View this message in context: 
http://old.nabble.com/Availability-of-values-in-a-key-in-Reduce-stage-tp27727136p27727136.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
