The balancer is not responsible for region size decisions. It only decides which regionservers should host which regions. Splits are determined by the data size of a region. See the max store file size setting (hbase.hregion.max.filesize).
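
To make the split side concrete, here is a rough sketch of a size-based split check. It is a simplified model, not HBase's actual split policy code: the function name, the region names, and the sizes are made up for illustration, and the 10 GiB threshold is only an assumed value standing in for hbase.hregion.max.filesize. Real split policies can apply additional rules.

```python
# Simplified model of a size-based split check; not HBase's actual code.
GIB = 1024 ** 3

# Assumed threshold, standing in for hbase.hregion.max.filesize.
MAX_FILESIZE_BYTES = 10 * GIB


def needs_split(store_sizes_bytes, max_filesize_bytes=MAX_FILESIZE_BYTES):
    """Return True if any store in the region has grown past the threshold."""
    return any(size > max_filesize_bytes for size in store_sizes_bytes)


# Hypothetical regions: name -> per-store sizes in bytes.
regions = {
    "region-a": [2 * GIB, 3 * GIB],
    "region-b": [12 * GIB],
}

# Only data size is considered here; which server hosts a region plays no part.
oversized = [name for name, stores in regions.items() if needs_split(stores)]
print(oversized)
```

The point of the sketch is that the decision looks only at how much data a region holds. Where the region is hosted, which is all the balancer controls, does not enter into it.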

On Thu, Jun 18, 2015 at 7:50 AM, Nasron Cheong <[email protected]> wrote:

> Hi,
>
> I've noticed there are two settings available when using the HBase balancer
> (specifically the default stochastic balancer)
>
> hbase.master.balancer.stochastic.tableSkewCost
>
> hbase.master.loadbalance.bytable
>
> How do these two settings relate? The documentation indicates when using
> the stochastic balancer that 'bytable' should be set to false?
>
> Our deployment relies on very few, very large tables, and I've noticed bad
> distribution when accessing some of the tables. E.g. there are 443 regions
> for a single table, but when doing a MR job over a full scan of the table,
> the first 426 regions scan quickly (minutes), but the remaining 17 regions
> take significantly longer (hours)
>
> My expectation is to have the balancer equalize the size of the regions for
> each table.
>
> Thanks!
>
> - Nasron
>
