Parth,

The reducer process has 2 distinct steps:
1. Shuffle
2. Reduce

In the shuffle phase, reducer 'r' does the following:
1. copies the data generated by all the mappers for reducer 'r'
2. sorts it

After the shuffle phase, the reduce phase starts. In this phase the reducer invokes the reduce() function for each [k, <v1, v2, ...>] pair generated in the shuffle phase, so all values for a key k arrive together in a single reduce() call.

Amar

On 2/27/10 4:36 PM, "parth" <[email protected]> wrote:

Hi,

I am confused about a particular point regarding the reducer. Can anyone guide me on this?

When the mappers start generating key-value pairs, will they all be available in the reducer, i.e. only after all mappers have exited? I mean, for a key K, will all values be grouped and available in the reducer? Or will the reducer run on a single key-value pair as it becomes available? The second option seems highly unrealistic.

Thanks,
Parth

--
View this message in context: http://old.nabble.com/Availability-of-values-in-a-key-in-Reduce-stage-tp27727136p27727136.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
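
As a rough illustration of the shuffle-then-reduce order described in the reply, here is a minimal in-memory sketch in Python. It is not the Hadoop API: the names partition, shuffle, run_reducer and sum_counts are hypothetical, the byte-sum partitioner is only a deterministic stand-in, and the sample map outputs are made up for the example.

```python
from itertools import groupby
from operator import itemgetter


def partition(key, num_reducers):
    # Hypothetical stand-in for a partitioner: deterministic byte sum of the key.
    return sum(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, r, num_reducers):
    # Step 1 (copy): from every mapper's output, take the pairs meant for reducer r.
    copied = [
        (k, v)
        for output in map_outputs
        for (k, v) in output
        if partition(k, num_reducers) == r
    ]
    # Step 2 (sort): order by key so that equal keys end up adjacent.
    copied.sort(key=itemgetter(0))
    # Group adjacent equal keys into (k, [v1, v2, ...]).
    return [
        (k, [v for _, v in group])
        for k, group in groupby(copied, key=itemgetter(0))
    ]


def run_reducer(map_outputs, r, num_reducers, reduce_fn):
    # reduce_fn is called once per key, and only after shuffle() has finished.
    return [reduce_fn(k, values) for k, values in shuffle(map_outputs, r, num_reducers)]


def sum_counts(key, values):
    # Example reduce function (an assumption for this sketch): sum the values.
    return (key, sum(values))


# Made-up output of three mappers, each a list of (key, value) pairs.
map_outputs = [
    [("a", 1), ("b", 1)],
    [("a", 1), ("c", 1)],
    [("b", 1), ("a", 1)],
]

results = [run_reducer(map_outputs, r, 2, sum_counts) for r in range(2)]
```

In this sketch the reduce function never sees a lone key-value pair as it is produced; it only receives a key together with the full list of values collected for it from all mappers.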
