Jean-Marc: Take a look at HRegionPartitioner, which is in both the mapred and mapreduce packages. Its class javadoc says:

 * This is used to partition the output keys into groups of keys.
 * Keys are grouped according to the regions that currently exist
 * so that each reducer fills a single region so load is distributed.
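For the mapreduce-package variant, here is a minimal sketch of wiring it up. This is not from the thread; the table names, column family, and the tab-separated input format are placeholders I picked for illustration. TableMapReduceUtil.initTableReducerJob has an overload that takes a partitioner class, and IdentityTableReducer simply writes the Puts it receives to the output table:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HRegionPartitioner;
import org.apache.hadoop.hbase.mapreduce.IdentityTableReducer;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class RegionPartitionedLoad {

  // Parses "rowkey<TAB>value" lines and emits one Put per line,
  // keyed by the row key. Family "f" / qualifier "q" are placeholders.
  static class LineToPutMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      String[] fields = line.toString().split("\t", 2);
      if (fields.length < 2) {
        return; // skip malformed lines
      }
      Put put = new Put(Bytes.toBytes(fields[0]));
      put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes(fields[1]));
      context.write(new ImmutableBytesWritable(put.getRow()), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "region-partitioned-load");
    job.setJarByClass(RegionPartitionedLoad.class);
    job.setMapperClass(LineToPutMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));

    // The 4-argument overload of initTableReducerJob accepts a partitioner.
    // HRegionPartitioner routes each row key to the reducer that owns the
    // region the key falls into, so each reducer fills a single region.
    TableMapReduceUtil.initTableReducerJob(
        "target_table", IdentityTableReducer.class, job,
        HRegionPartitioner.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

With HRegionPartitioner in place, the reducer count is also capped at the number of regions in the output table, so the write load lines up with the table's region layout.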
Cheers

On Wed, Apr 10, 2013 at 6:54 AM, Jean-Marc Spaggiari <[email protected]> wrote:
> Hi Nitin,
>
> You got my question correctly.
>
> However, I'm wondering how it works when the output goes into HBase. Do
> we have default partitioners giving the same guarantee that records
> mapping to one key go to the same reducer, or do we have to implement
> this on our own?
>
> JM
>
> 2013/4/10 Nitin Pawar <[email protected]>:
> > I hope I understood what you are asking; if not, pardon me :)
> > From the Hadoop developer handbook, a few lines:
> >
> > The *Partitioner* class determines which partition a given (key, value)
> > pair will go to. The default partitioner computes a hash value for the
> > key and assigns the partition based on this result. It guarantees that
> > all the records mapping to one key go to the same reducer.
> >
> > You can write your custom partitioner as well.
> > Here is the link:
> > http://developer.yahoo.com/hadoop/tutorial/module5.html#partitioning
> >
> > On Wed, Apr 10, 2013 at 6:19 PM, Jean-Marc Spaggiari <
> > [email protected]> wrote:
> >
> >> Hi,
> >>
> >> Quick question: how is the data from the map tasks partitioned for
> >> the reducers?
> >>
> >> If there is one reducer it's easy, but if there are more, are all the
> >> same keys guaranteed to end up on the same reducer, or not necessarily?
> >> If they are not, how can we provide a partitioning function?
> >>
> >> Thanks,
> >>
> >> JM
> >>
> >
> >
> > --
> > Nitin Pawar
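To make the quoted explanation concrete: the default partitioner in plain MapReduce is HashPartitioner, whose getPartition is just (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks, and that determinism is what guarantees records with equal keys always meet at the same reducer. A custom partitioner only has to preserve the same property of being a deterministic function of the key. Here is a minimal illustrative sketch; the class name and the key/value types are my own choices, not anything from the thread:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Routes keys by their first character. Any deterministic function of the
// key alone keeps the guarantee that equal keys reach the same reducer.
public class FirstCharPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text key, IntWritable value, int numPartitions) {
    if (key.getLength() == 0) {
      return 0; // all empty keys go to partition 0
    }
    // Text.charAt returns the Unicode code point at the given position.
    return (key.charAt(0) & Integer.MAX_VALUE) % numPartitions;
  }
}

Enable it with job.setPartitionerClass(FirstCharPartitioner.class); keys sharing a first character are then grouped onto one reducer, and identical keys still always land together.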
