We write the table's split keys into a partitions file, which is read by
TotalOrderPartitioner to distribute the keys.

See HFileOutputFormat2#configurePartitioner
{code}
  /**
   * Configure <code>job</code> with a TotalOrderPartitioner, partitioning against
   * <code>splitPoints</code>. Cleans up the partitions file after job exits.
   */
  static void configurePartitioner(Job job, List<ImmutableBytesWritable> splitPoints)
      throws IOException {

    // create the partitions file
    FileSystem fs = FileSystem.get(job.getConfiguration());
    Path partitionsPath = new Path("/tmp", "partitions_" + UUID.randomUUID());
    fs.makeQualified(partitionsPath);
    fs.deleteOnExit(partitionsPath);
    writePartitions(job.getConfiguration(), partitionsPath, splitPoints);

    // configure job to use it
    job.setPartitionerClass(TotalOrderPartitioner.class);
    TotalOrderPartitioner.setPartitionFile(job.getConfiguration(), partitionsPath);
  }
{code}
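
To illustrate how a row key ends up in a partition once the split points are known, here is a minimal standalone sketch. It is not the HBase or Hadoop implementation: the class name SplitPointLookup, the method partitionFor and the sample split points are made up for illustration, and it assumes the split points are sorted in unsigned byte order and that a key equal to a split point belongs to the partition that starts at that split point.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, for illustration only (not part of HBase or Hadoop).
class SplitPointLookup {

    // Unsigned lexicographic comparison of byte arrays.
    static final Comparator<byte[]> BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    };

    // Returns the partition index for rowKey, given N sorted split points
    // that define N + 1 partitions. Assumption: a key equal to split point i
    // goes to partition i + 1, i.e. the partition starting at that split point.
    static int partitionFor(byte[] rowKey, byte[][] sortedSplitPoints) {
        int pos = Arrays.binarySearch(sortedSplitPoints, rowKey, BYTES) + 1;
        // binarySearch returns -(insertionPoint) - 1 when the key is absent,
        // so after adding 1 a negative value is the negated insertion point.
        return pos < 0 ? -pos : pos;
    }

    public static void main(String[] args) {
        // Made-up split points: partitions are [.., g), [g, n), [n, t), [t, ..)
        byte[][] splits = {
            "g".getBytes(StandardCharsets.UTF_8),
            "n".getBytes(StandardCharsets.UTF_8),
            "t".getBytes(StandardCharsets.UTF_8)
        };
        for (String row : new String[] {"apple", "g", "kiwi", "zebra"}) {
            int p = partitionFor(row.getBytes(StandardCharsets.UTF_8), splits);
            System.out.println(row + " -> partition " + p);
        }
    }
}
```

In the real job the split points come from the partitions file written above, and each partition corresponds to one reducer and hence to one region of the pre-split table.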

Thanks,
Rajeshbabu.

____________________
From: divye sheth [[email protected]]
Sent: Tuesday, March 25, 2014 4:07 PM
To: [email protected]
Subject: Bulk Loading with Presplits

Hi,

I have a table with presplits and am writing a utility to bulk load
StoreFiles into it using the doBulkLoad functionality. My question is:
how does HBase handle the distribution of the keys when performing a
bulk load?

How does it decide which key (row) goes to which partition?

Please help me understand this.
HBase version: 0.94.2

Thanks
Divye Sheth
