Re: HBase bulk loaded region can't be splitted

Subir S Sat, 12 May 2012 01:28:16 -0700

Wouldn't a major_compact trigger a split, if the region really needs to split?

However if you want to presplit regions for your table you can use the
regionsplitter utility as below:


$ export HADOOP_CLASSPATH=`hbase classpath`
$ hbase org.apache.hadoop.hbase.util.RegionSplitter

Run without arguments, this prints the usage message.

A sample invocation:

hbase org.apache.hadoop.hbase.util.RegionSplitter -c 10 'mytable' -f ns
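
If you would rather choose the split points yourself, here is a minimal sketch in bash. It assumes (this is not from the thread) that the row keys are 8-character lowercase hex strings spread uniformly over the key space, and the function names split_keys and create_statement are made up for illustration. It only prints an HBase shell create statement with a SPLITS list; check that the shell in your HBase version accepts that form before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only. ASSUMPTION: row keys are 8-character lowercase hex strings,
# uniformly distributed. Adjust the key space to match your real keys.

# Print N-1 evenly spaced split keys for a table with N regions.
split_keys() {
  local regions=$1
  local range=$((0x100000000))   # size of the assumed 32-bit hex key space
  local i
  for ((i = 1; i < regions; i++)); do
    printf '%08x\n' $(( range * i / regions ))
  done
}

# Print an HBase shell "create" statement that uses those split keys.
# Table and family names are passed in; nothing is sent to HBase here.
create_statement() {
  local table=$1 family=$2 regions=$3
  local keys
  keys=$(split_keys "$regions" | sed "s/.*/'&'/" | paste -sd, -)
  printf "create '%s', '%s', {SPLITS => [%s]}\n" "$table" "$family" "$keys"
}

create_statement mytable ns 4
```

For a table that does not exist yet, the printed statement could be piped into hbase shell to create it pre-split. The import then has to be re-run so that the HFiles are written along the new region boundaries.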


On Fri, May 11, 2012 at 8:37 AM, Bruce Bian <[email protected]> wrote:

> Yes, I understand that.
> But after I complete the bulk load, shouldn't it trigger the region server
> to split that region in order to meet the hbase.hregion.max.filesize
> criteria?
> When I try to split the region manually using the web UI, nothing happens;
> instead, the following message appears in the region server log:
>
> Region mytable,,1334215360439.71611409ea972a65b0876f953ad6377e.
> not splittable because midkey=null
>
>
> On Fri, May 11, 2012 at 10:56 AM, Bryan Beaudreault <
> [email protected]> wrote:
>
> > I haven't done bulk loads using the importtsv tool, but I imagine it works
> > similarly to the mapreduce bulk load tool we are provided.  If so, the
> > following stands.
> >
> > In order to do a bulk load you need to have a table ready to accept the
> > data.  The bulk load does not create regions, but only puts data into the
> > right place based on existing regions.  Since you only have 1 region to
> > start with, it makes sense that all of the data would go to that one
> > region.  You should find a way to calculate the regions that you want and
> > create your table with pre-created regions.  Then re-run the import.
> >
> > On Thu, May 10, 2012 at 10:50 PM, Bruce Bian <[email protected]>
> > wrote:
> >
> > > I use importtsv to generate HFiles from my data:
> > >
> > > hadoop jar hbase-0.92.1.jar importtsv
> > > -Dimporttsv.bulk.output=/outputs/mytable.bulk
> > > -Dimporttsv.columns=HBASE_ROW_KEY,ns: -Dimporttsv.separator=, mytable
> > > /input
> > >
> > > Then I use completebulkload to load the generated HFiles into my table:
> > >
> > > hadoop jar hbase-0.92.1.jar completebulkload /outputs/mytable.bulk mytable
> > >
> > > However, the table is very large (4.x GB), yet it has only one
> > > region. Why doesn't HBase split it into multiple regions? It
> > > exceeds the split threshold (256MB).
> > >
> > > /hbase/mytable/71611409ea972a65b0876f953ad6377e/ns:
> > >
> > > [image: enter image description here]
> > >
> > > To split it, I tried the Split button on the HBase web UI. Sadly, it
> > > shows
> > >
> > > org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region
> > > mytable,,1334215360439.71611409ea972a65b0876f953ad6377e. not
> > > splittable because midkey=null
> > >
> > > I have more data to load, about 300GB. No matter how much data I load,
> > > the table still has only one region, and it is still not splittable.
> > > Any idea?
> > >
> >
>
