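With the default partitioner, the key "WHO" always lands in the partition with the same number, whichever map task emits it, and each reducer collects every partition carrying its own number: reducer 0 fetches partition 0 from every map output, reducer 1 fetches partition 1, and so on. In other words, partitions are numbered per reducer, not per node. With 2 reducers, every map task produces a partition 0 and a partition 1, so the "partition 1 and partition 4" mix in your example cannot happen. But yes, you can mess with the partitioner logic for good or evil, if that's your question. With great power comes great responsibility.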
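To make the default behaviour concrete, here is a minimal sketch of what the default HashPartitioner computes: the key's hash, masked to be non-negative, modulo the number of reduce tasks. The class name PartitionSketch and the method partitionFor are hypothetical names made up for illustration, and String.hashCode() stands in for the key's hashCode() as an assumption (Hadoop's Text computes its hash differently). Only the formula is meant to match.

```java
// Hypothetical, self-contained sketch; these are not Hadoop's actual classes.
final class PartitionSketch {

    // Same formula as Hadoop's default HashPartitioner:
    //   (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks
    // Assumption: String.hashCode() stands in for the real key type's hashCode().
    static int partitionFor(String key, int numReduceTasks) {
        // The mask clears the sign bit so the result is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 2;

        // Two map tasks on different nodes evaluate the same pure function,
        // so both put "WHO" into the same partition number.
        int onNode1 = partitionFor("WHO", numReduceTasks);
        int onNode2 = partitionFor("WHO", numReduceTasks);

        System.out.println("node 1 puts WHO in partition " + onNode1);
        System.out.println("node 2 puts WHO in partition " + onNode2);
    }
}
```

Since the partition number depends only on the key and the number of reducers, no coordination between the nodes is needed: the reducer with that number pulls that partition from every map output and sees every "WHO".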
Regards

Bertrand

Bertrand Dechoux

On Fri, Mar 14, 2014 at 10:34 AM, Pratap M <[email protected]> wrote:

> Hi,
>
> I understand that the mapper produces 1 partition per reducer. How does
> the reducer know which partition to copy? Lets say there are 2 nodes
> running mapper for word count program and there are 2 reducers configured.
> If each map node produces 2 partitions, with the possibility of partitions
> in both the nodes containing same word as key, how will the reducer work
> correctly?
>
> For ex:
>
> If node 1 produces partition 1 and partition 2, and partition 1 contains a
> key named "WHO".
>
> If node 2 produces partition 3 and partition 4, and partition 3 contains a
> key named "WHO".
>
> If Partition 1 and Partition 4 went to reducer 1 (and remaining to reducer
> 2), how does the reducer 1 compute the correct word count?
>
> If this is not a possibility, and partition 1 and 3 would be made to go to
> reducer 1, how Hadoop does this? Does it make sure a given key-value pair
> from different nodes always go to a same reducer? If so, how it does this?
>
