Wouldn't a major_compact trigger a split, if the region really needs to split? However, if you want to pre-split regions for your table, you can use the RegionSplitter utility as below:

    $ export HADOOP_CLASSPATH=`hbase classpath`; hbase org.apache.hadoop.hbase.util.RegionSplitter

This will give you a usage. A sample is:

    hbase org.apache.hadoop.hbase.util.RegionSplitter -c 10 'mytable' -f ns

On Fri, May 11, 2012 at 8:37 AM, Bruce Bian <[email protected]> wrote:

> Yes, I understand that.
> But after I complete the bulk load, shouldn't it trigger the region server
> to split that region in order to meet the hbase.hregion.max.filesize
> criteria?
> When I try to split the regions manually using the WebUI, nothing happened,
> but instead a
>
>     Region mytable,,1334215360439.71611409ea972a65b0876f953ad6377e.
>     not splittable because midkey=null
>
> message is found in the region server log.
>
> On Fri, May 11, 2012 at 10:56 AM, Bryan Beaudreault <[email protected]> wrote:
>
> > I haven't done bulk loads using the importtsv tool, but I imagine it works
> > similarly to the mapreduce bulk load tool we are provided. If so, the
> > following stands.
> >
> > In order to do a bulk load you need to have a table ready to accept the
> > data. The bulk load does not create regions, but only puts data into the
> > right place based on existing regions. Since you only have 1 region to
> > start with, it makes sense that they would all go to that one region. You
> > should find a way to calculate the regions that you want and create your
> > table with pre-created regions. Then re-run the import.
> >
> > On Thu, May 10, 2012 at 10:50 PM, Bruce Bian <[email protected]> wrote:
> >
> > > I use importtsv to load data as HFile
> > >
> > > hadoop jar hbase-0.92.1.jar importtsv
> > > -Dimporttsv.bulk.output=/outputs/mytable.bulk
> > > -Dimporttsv.columns=HBASE_ROW_KEY,ns: -Dimporttsv.separator=, mytable
> > > /input
> > >
> > > Then I use completebulkload to load those bulk data into my table
> > >
> > > hadoop jar hbase-0.92.1.jar completebulkload /outputs/mytable.bulk mytable
> > >
> > > However, the size of table is very huge (4.x GB). And it has only one
> > > region. Oddly, why doesn't HBase split it into multiple regions? It did
> > > exceed the size to split (256MB).
> > >
> > > /hbase/mytable/71611409ea972a65b0876f953ad6377e/ns:
> > >
> > > [image: enter image description here]
> > >
> > > To split it, I try to use Split button on the Web UI of HBase. Sadly, it
> > > shows
> > >
> > > org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region
> > > mytable,,1334215360439.71611409ea972a65b0876f953ad6377e. not
> > > splittable because midkey=null
> > >
> > > I have more data to load. About 300GB, no matter how many data I have
> > > loaded, it is still only one region. Also, it is still not splittable.
> > > Any idea?
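
For the suggestion in the thread to "calculate the regions that you want" before re-running the import, here is a rough sketch of one way to do that. It assumes the row keys are fixed-width lowercase hex strings spread evenly over the keyspace, which nothing in the thread confirms; with other key shapes the split points would have to come from sampling the real input. The helper names (hex_split_keys, shell_create_statement) are made up for illustration. The SPLITS option of the HBase shell's create command should be checked against the help output of the installed version.

```python
def hex_split_keys(num_regions, key_width=8):
    """Return num_regions - 1 evenly spaced split keys.

    Assumption: row keys are lowercase hex strings of exactly key_width
    characters, uniformly distributed. Adjust for the real key format.
    """
    if num_regions < 2:
        return []  # a single region needs no split points
    keyspace = 16 ** key_width
    return [
        format(keyspace * i // num_regions, "0{}x".format(key_width))
        for i in range(1, num_regions)
    ]


def shell_create_statement(table, family, split_keys):
    """Build an HBase shell 'create' line that pre-splits the table."""
    quoted = ", ".join("'{}'".format(key) for key in split_keys)
    return "create '{}', '{}', SPLITS => [{}]".format(table, family, quoted)


if __name__ == "__main__":
    # Table and family names are the ones used in the thread.
    keys = hex_split_keys(4)
    print(shell_create_statement("mytable", "ns", keys))
```

The printed line can be pasted into the hbase shell to create the empty, pre-split table; importtsv and completebulkload would then be re-run so the generated HFiles line up with the existing region boundaries.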
