If you plan to pre-split regions, look at the classes exposed by RegionSplitter (http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/RegionSplitter.html).
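
To make the idea concrete, here is a minimal sketch of how evenly spaced split points could be computed by hand. It assumes the row keys are fixed-width, lowercase hexadecimal strings (32 characters, like the sample user IDs quoted below) and that they are spread uniformly over the keyspace. The class name HexSplitSketch and the method splitKeys are made up for illustration and are not part of HBase.

```java
import java.math.BigInteger;

public class HexSplitSketch {

    /**
     * Returns numRegions - 1 split keys that cut the keyspace of
     * fixed-width, lowercase hex strings into equally sized ranges.
     * Assumption: keys are uniformly distributed (e.g. hash output).
     */
    public static String[] splitKeys(int numRegions, int keyWidth) {
        if (numRegions < 2 || keyWidth < 1) {
            throw new IllegalArgumentException("need numRegions >= 2 and keyWidth >= 1");
        }
        // Total number of possible keys: 16^keyWidth.
        BigInteger range = BigInteger.valueOf(16).pow(keyWidth);
        BigInteger regions = BigInteger.valueOf(numRegions);

        String[] splits = new String[numRegions - 1];
        for (int i = 1; i < numRegions; i++) {
            // i-th boundary = floor(range * i / numRegions)
            BigInteger boundary = range.multiply(BigInteger.valueOf(i)).divide(regions);
            String hex = boundary.toString(16); // lowercase digits

            // Left-pad with zeros so every split key has the full key width.
            StringBuilder padded = new StringBuilder();
            for (int p = hex.length(); p < keyWidth; p++) {
                padded.append('0');
            }
            splits[i - 1] = padded.append(hex).toString();
        }
        return splits;
    }

    public static void main(String[] args) {
        // 30 regions over 32-character hex keys gives 29 split points.
        for (String key : splitKeys(30, 32)) {
            System.out.println(key);
        }
    }
}
```

The resulting strings, converted to byte arrays, could then be handed to HBaseAdmin.createTable(HTableDescriptor, byte[][]) as the split keys. If the keys are binary encoded rather than hex strings, the boundaries would have to be computed on the raw bytes instead.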
Are your keys Strings representing hexadecimal values, or are they really binary encoded? (I mean \xFF\x03 and not "F3", for example.)

On Wed, Aug 29, 2012 at 4:46 PM, Oleg Ruchovets wrote:
> Hi ,
> I have bulk loading job.
> My job is for User data aggregation.
> Before I run Bulk Loading aggregation I want to create regions
> UserID looks like this :
>
> 943e2c6d66d732e06ab257903f240d27
> a0617cb2b964690a39b0d93e7fe2f021
> ac85b4dee6d8c8495d61201234dfb73e
> b8416d5e0fe2a1228f042dffa8d291e2
> c422be9e75d28d9afe0f1f98f59cda92
> fe6b0ad1822455958586e240eb75c1d7
> 1790ee2ce4487d976cd9eddd036275d6
> 344c3de9449a9522d2a4de8bb9e81b02
> 4fcccd6790aec3056f897741b467d08c
> 6b67dc1922e4fc0cd6fa31f64bd51ef3
> 87f1374e7c900a243450f5b5c3a2b2b9
> a4180db6a62f300cdecf77310f0010ac
>
> I have ~ 50.000.000 users. I run aggregation on daily basis and per day I
> have ~ 30 regions.
> So The objective is to create 30 regions with more or less equal
> distributions.
>
> The question is : What is the best practice to verify start / end key for
> regions in my use case?
>
> Thanks in advance
> Oleg.

-- AM
