After the reducer has fetched the data, it does its own sorting and merging as well. The merge brings records with the same key together, and the framework then passes each key, along with the collection of values for that key, to the reducer one key at a time, as you said. So it is the sort and merge on the reducer side that groups the records of the same key (after sorting).
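To make the grouping step concrete, here is a rough Python sketch of the idea. It is not Hadoop code: `reduce_side_merge`, `word_count_reduce`, and the sample `segments` are made-up names and data for illustration, and it assumes each map output segment arrives already sorted by key.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def reduce_side_merge(sorted_map_outputs):
    """Merge already-sorted map output segments and group values by key.

    Illustrative only: this mimics the reducer-side sort/merge, it is not
    how Hadoop is implemented internally.
    """
    # k-way merge of the sorted segments, ordered by key
    merged = heapq.merge(*sorted_map_outputs, key=itemgetter(0))
    # equal keys are now adjacent, so groupby can collect their values
    for key, records in groupby(merged, key=itemgetter(0)):
        yield key, [value for _, value in records]


def word_count_reduce(key, values):
    """Hypothetical reduce function: receives one key and all its values."""
    return key, sum(values)


# Hypothetical outputs fetched from two mappers, each sorted by key.
segments = [
    [("apple", 1), ("cat", 1)],
    [("apple", 1), ("bat", 1), ("cat", 1)],
]

results = [word_count_reduce(k, vs) for k, vs in reduce_side_merge(segments)]
```

The point of the sketch is that the reduce function never does the grouping itself; it only sees a key and its values after the merge has put equal keys next to each other.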
Have a look at the diagram at http://hadoop-gyan.blogspot.in/2012/11/map-reduce-shuffle-and-sort.html (it's not my blog).

On Mon, Dec 22, 2014 at 10:52 AM, [email protected] <[email protected]> wrote:

> Then what exactly happens after Reducer pulls all mapper output key/value
> pairs from all the mapper nodes before reducer see the
> <key,value1,value2..>?
>
> ------------------------------
> [email protected]
>
>
> *From:* Susheel Kumar Gadalay <[email protected]>
> *Date:* 2014-12-22 13:20
> *To:* user <[email protected]>
> *Subject:* Re: Question about shuffle/merge/sort phrase
> Sorry, typo
>
> It is the reducer which will pull the mapper o/p as soon as it completes.
>
> On 12/22/14, Susheel Kumar Gadalay <[email protected]> wrote:
> > It is the mapper which will push the o/p to the respective reducer as
> > soon as it completes.
> >
> > The no of reducers are known at the beginning itself.
> > The mapper as it process the input split, generate the o/p of for each
> > reducer (if the mapper o/p key is eligible for the reducer).
> > The reducer will wait till the completion of all map tasks to start it
> > processing.
> >
> >
> > On 12/22/14, [email protected] <[email protected]> wrote:
> >> Could some one help me on this question? thanks.
> >>
> >>
> >>
> >> [email protected]
> >>
> >> From: Todd
> >> Sent: 2014-12-21 21:59
> >> To: [email protected]
> >> Subject: Question about shuffle/merge/sort phrase
> >> Hi, Hadoopers,
> >> I got a question about shuffle/sort/merge phrase related..
> >> My understanding is that shuffle is used to transfer the mapper
> >> output(key/value pairs) from mapper node to reducer node, and merge phrase
> >> is used to merge all the mapper output from all mapper nodes, and sort
> >> phrase is used to sort the key/value pair by key,
> >> Then my question, whose responsibility is it that brings each key with all
> >> its values together (The reducer's input is a key and an iterative
> >> values).
> >>
> >>
> >> Thanks.
> >>
> >
>

--
Thanks and regards
Sandeep Khurana
