On Wed, Apr 10, 2013 at 12:01 PM, Jean-Marc Spaggiari < [email protected]> wrote:
> Hi Greame,
>
> No. The reducer will simply write to the table the same way you would with a
> regular Put. If a split is required because of the size, then the region
> will be split, but in the end there will not necessarily be any region
> split.
>
> In the use case described below, all the 600 lines will "simply" go into the
> only region in the table and no split will occur.
>
> The goal is to partition the data for the reducer only. Not in the table.
>

Then just use the default partitioner? The suggestion that you use
HTablePartitioner seems inappropriate to your task. See the sink doc here:
http://hadoop.apache.org/docs/r2.0.3-alpha/api/org/apache/hadoop/mapreduce/lib/partition/HashPartitioner.html

St.Ack
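
For illustration only, here is a rough sketch of what the default partitioner does with the 600 rows from the use case above. It does not depend on Hadoop: the class name, the "row-N" key format, and the reducer count of 4 are all made up for the example. The formula in partitionFor is the one HashPartitioner uses (hash code masked to non-negative, modulo the number of reduce tasks). Nothing here touches the table or its regions; it only decides which reducer sees which key.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; in a real job Hadoop's own
// org.apache.hadoop.mapreduce.lib.partition.HashPartitioner does this for you.
public class HashPartitionSketch {

    // Same formula as HashPartitioner.getPartition: mask off the sign bit,
    // then take the remainder by the number of reduce tasks.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many of `rows` keys land on each reducer.
    // The "row-N" key format is an assumption made for this example.
    static Map<Integer, Integer> countPerReducer(int rows, int numReduceTasks) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int i = 0; i < rows; i++) {
            String rowKey = "row-" + i;
            counts.merge(partitionFor(rowKey, numReduceTasks), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // 600 rows as in the thread; 4 reducers is an arbitrary choice.
        System.out.println(countPerReducer(600, 4));
    }
}
```

In an actual job you would not write any of this: HashPartitioner is what the job uses when you do not call job.setPartitionerClass(...), so setting job.setNumReduceTasks(n) should be enough to spread the keys over n reducers.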
