On 24/09/10 16:55, Ted Yu wrote:
 From TotalOrderPartitioner:
       K[] splitPoints = readPartitions(fs, partFile, keyClass, conf);
       if (splitPoints.length != job.getNumReduceTasks() - 1) {
Partition list can be empty if you use 1 reducer.

But this is not what you want I guess.
Yes, this is not what we want since we want to create x regions.
But, we just found that there is a tool, InputSampler, in the hadoop library for this task. It samples an arbitrary dataset and creates the partition splits. We will try this approach first. My guess is that, even if these partitions are an approximation, it should be ok for hbase. The sizes of the regions will not be totally identical, but that should not be a problem, since the larger regions will be the first ones split into smaller regions by hbase. Can somebody confirm this assumption?
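To illustrate the idea behind InputSampler + TotalOrderPartitioner, here is a minimal standalone sketch (not the Hadoop API itself; class and method names are made up for illustration): sample keys from the dataset, sort the sample, and take evenly spaced quantiles as the split points. Note that x partitions need only x - 1 boundary keys, which matches the `splitPoints.length != job.getNumReduceTasks() - 1` check quoted above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SamplingSplits {

    // Approximate split points: sort the sampled keys and pick evenly
    // spaced quantiles. numPartitions partitions need numPartitions - 1
    // boundary keys. Hypothetical helper, not part of Hadoop.
    static List<String> splitPoints(List<String> sample, int numPartitions) {
        List<String> sorted = new ArrayList<>(sample);
        Collections.sort(sorted);
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < numPartitions; i++) {
            // index of the i-th quantile within the sorted sample
            splits.add(sorted.get(i * sorted.size() / numPartitions));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Simulate sampled row keys (a fixed seed keeps the demo repeatable)
        Random rnd = new Random(42);
        List<String> sample = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            sample.add(String.format("row%05d", rnd.nextInt(100000)));
        }
        List<String> splits = splitPoints(sample, 10);
        // 10 partitions -> 9 boundary keys
        System.out.println(splits.size());
        System.out.println(splits);
    }
}
```

Because the split points come from a sample rather than the full key set, the resulting partitions (and hence regions) are only approximately equal in size, which is exactly the approximation discussed above.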
--
Renaud Delbru
