If you're interested in region size balancing, please have a look at https://issues.apache.org/jira/browse/HBASE-13103 . Please provide feedback as we're hoping to have an early version available in 1.2.
Which reminds me, I owe Mikhail another review...

On Thu, Jun 18, 2015 at 9:39 AM, Elliott Clark <[email protected]> wrote:

> The balancer is not responsible for region size decisions. The balancer is
> only responsible for deciding which regionservers should host which
> regions.
> Splits are determined by the data size of a region. See max store file size.
>
> On Thu, Jun 18, 2015 at 7:50 AM, Nasron Cheong <[email protected]> wrote:
>
> > Hi,
> >
> > I've noticed there are two settings available when using the HBase
> > balancer (specifically the default stochastic balancer):
> >
> > hbase.master.balancer.stochastic.tableSkewCost
> >
> > hbase.master.loadbalance.bytable
> >
> > How do these two settings relate? The documentation indicates that when
> > using the stochastic balancer, 'bytable' should be set to false?
> >
> > Our deployment relies on very few, very large tables, and I've noticed
> > bad distribution when accessing some of the tables. E.g. there are 443
> > regions for a single table, but when doing an MR job over a full scan of
> > the table, the first 426 regions scan quickly (minutes), but the
> > remaining 17 regions take significantly longer (hours).
> >
> > My expectation is to have the balancer equalize the size of the regions
> > for each table.
> >
> > Thanks!
> >
> > - Nasron
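
For reference, a minimal hbase-site.xml sketch of the settings discussed in this thread. The two balancer property names are taken from the question above; hbase.hregion.max.filesize is assumed to be the "max store file size" setting Elliott refers to. All values are illustrative only, not recommendations from anyone on the thread.

```xml
<configuration>
  <!-- Balancer: weight the stochastic balancer gives to spreading each
       table's regions evenly across regionservers (illustrative value). -->
  <property>
    <name>hbase.master.balancer.stochastic.tableSkewCost</name>
    <value>35</value>
  </property>

  <!-- Balancer: whether to balance each table separately. The question
       above notes the docs suggest false with the stochastic balancer. -->
  <property>
    <name>hbase.master.loadbalance.bytable</name>
    <value>false</value>
  </property>

  <!-- Splits (not the balancer): a region becomes a split candidate once
       a store file grows past this many bytes (illustrative: 10 GB). -->
  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>10737418240</value>
  </property>
</configuration>
```

Only the last property influences how large a region can grow; the first two affect where regions are placed, which matches the distinction drawn in Elliott's reply.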
