We write the split keys of the table into a partitions file, which is read by
TotalOrderPartitioner to distribute the keys across partitions.
See HFileOutputFormat2#configurePartitioner:
{code}
/**
 * Configure <code>job</code> with a TotalOrderPartitioner, partitioning against
 * <code>splitPoints</code>. Cleans up the partitions file after job exists.
 */
static void configurePartitioner(Job job, List<ImmutableBytesWritable> splitPoints)
    throws IOException {
  // create the partitions file
  FileSystem fs = FileSystem.get(job.getConfiguration());
  Path partitionsPath = new Path("/tmp", "partitions_" + UUID.randomUUID());
  fs.makeQualified(partitionsPath);
  fs.deleteOnExit(partitionsPath);
  writePartitions(job.getConfiguration(), partitionsPath, splitPoints);

  // configure job to use it
  job.setPartitionerClass(TotalOrderPartitioner.class);
  TotalOrderPartitioner.setPartitionFile(job.getConfiguration(), partitionsPath);
}
{code}
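To illustrate how a row key ends up in a partition once the split points are known, here is a minimal standalone sketch. It is not the HBase or Hadoop code: the class SplitPointLookup and its methods are hypothetical names, and it assumes the split points are sorted and compared as unsigned bytes, the same ordering HBase uses for row keys. With n split points there are n + 1 partitions, and a key equal to a split point goes to the partition that starts at that split point.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of HBase or Hadoop.
public class SplitPointLookup {

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Binary search that counts how many split points are <= rowKey.
    // That count is the index of the partition the key belongs to.
    static int partitionFor(List<byte[]> sortedSplitPoints, byte[] rowKey) {
        int lo = 0;
        int hi = sortedSplitPoints.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compareBytes(sortedSplitPoints.get(mid), rowKey) <= 0) {
                lo = mid + 1;   // split point is at or before the key
            } else {
                hi = mid;       // split point is after the key
            }
        }
        return lo;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Three split points give four partitions: [..g), [g..n), [n..t), [t..)
        List<byte[]> splits = Arrays.asList(bytes("g"), bytes("n"), bytes("t"));
        for (String key : new String[] {"apple", "g", "mango", "zebra"}) {
            System.out.println(key + " -> partition " + partitionFor(splits, bytes(key)));
        }
        // apple -> 0, g -> 1, mango -> 1, zebra -> 3
    }
}
```

As far as I understand, when the job is set up through configureIncrementalLoad, the split points are the region start keys of the table (without the empty first start key), so each partition lines up with one region.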
Thanks,
Rajeshbabu.
____________________
From: divye sheth [[email protected]]
Sent: Tuesday, March 25, 2014 4:07 PM
To: [email protected]
Subject: Bulk Loading with Presplits
Hi,
I have a pre-split table and am writing a utility to bulk load
StoreFiles into it using the doBulkLoad functionality. My question is:
how does HBase handle the distribution of the keys when performing
a bulk load?
How does it decide which key (row) goes to which partition?
Please help me understand this.
HBase version: 0.94.2
Thanks
Divye Sheth