Hi,

You need to create your table with pre-split regions.
$ hbase org.apache.hadoop.hbase.util.RegionSplitter -c 10 -f region_name your_table

This command will pre-create 10 regions in your table, using MD5 strings as the region boundaries. You can also customize the splitting algorithm. Please see
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/RegionSplitter.html

-Yifeng

On May 11, 2012, at 11:29 AM, Bruce Bian wrote:

> I use importtsv to load data as HFiles:
>
> hadoop jar hbase-0.92.1.jar importtsv \
>   -Dimporttsv.bulk.output=/outputs/mytable.bulk \
>   -Dimporttsv.columns=HBASE_ROW_KEY,ns: \
>   -Dimporttsv.separator=, mytable /input
>
> Then I use completebulkload to load that bulk data into my table:
>
> hadoop jar hbase-0.92.1.jar completebulkload /outputs/mytable.bulk mytable
>
> However, the table is very large (4.x GB), yet it has only one region.
> Oddly, why doesn't HBase split it into multiple regions? It clearly
> exceeds the split threshold (256MB).
>
> /hbase/mytable/71611409ea972a65b0876f953ad6377e/ns:
>
> To split it, I tried the Split button in the HBase web UI. Sadly, it shows:
>
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region
> mytable,,1334215360439.71611409ea972a65b0876f953ad6377e. not
> splittable because midkey=null
>
> I have about 300GB more data to load. No matter how much data I have
> loaded, it is still only one region, and it is still not splittable.
> Any ideas?
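P.S. For intuition on what RegionSplitter computes: a uniform split algorithm (e.g. HexStringSplit) just divides a fixed-width hex keyspace into N equal slices and emits the N-1 boundary keys as region start keys. This is only a rough sketch of the idea, not HBase's actual implementation; the function name and the 8-hex-digit width are my own choices for illustration:

```python
def hex_split_keys(num_regions, width=8):
    """Sketch of uniform hex splitting: divide the keyspace
    [0, 16**width) into num_regions equal slices and return
    the num_regions - 1 boundary keys as fixed-width hex strings."""
    keyspace = 16 ** width
    step = keyspace // num_regions
    # N regions need N-1 split points (first region starts at the empty key)
    return [format(step * i, "0{}x".format(width))
            for i in range(1, num_regions)]

print(hex_split_keys(4))  # ['40000000', '80000000', 'c0000000']
```

Rows whose keys hash uniformly (MD5-style prefixes) then spread evenly across the pre-created regions, so the bulk load never funnels everything into a single region the way it did in your case.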
