I've done it. This is the code I used: https://gist.github.com/bbeaudreault/7567385
It comes from the HBase source, but is modified to actually work (the class provided in HBase is private and does not work out of the box). There is a readme at the bottom of the gist describing my process.

One important note, though: I did this with a deep understanding of how it all works, after hours of reading HBase code and running tests on a test cluster. Even then I felt nervous about doing it in prod, which is why I went the snapshot/compact route. I would definitely try it on a test cluster and get some familiarity before getting close to a production table.

That said, I ran this on 8-10 production tables a few months ago, reducing their size by 10-20x in some cases.

On Thu, Aug 28, 2014 at 2:19 PM, Ted Tuttle <[email protected]> wrote:

> Hello-
>
> We recently realized our region size is 1G and need to increase it to get
> our region count under control. I've done some research on merging regions
> and have come away confused.
>
> There is the ops handbook:
>
> http://hbase.apache.org/book/ops.regionmgt.html
>
> And then there is this horror story:
>
> http://metabroadcast.com/blog/so-you-broke-hbase
>
> Is there someone out there that has done a large scale (i.e. 10:1
> reduction on 10k's of regions) merge successfully on HBase 0.94? If so,
> how did you do it?
>
> Thanks,
> Ted
