Nasron,

Yeah, looks like you've pretty much implemented pieces of the logic being discussed in HBASE-13103 :) That's interesting, thanks for telling us. I'm wondering, how did you estimate the desired number of regions? Something like N * number of region servers, or something different?

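To make the question concrete, here is a rough sketch of the kind of estimate I have in mind. Everything in it is hypothetical: the function names, the regions-per-server multiplier N, and the "2x the average size" threshold are for illustration only and are not taken from the HBASE-13103 patch or from your tool.

```python
def desired_region_count(num_region_servers, regions_per_server):
    """Hypothetical heuristic: aim for N regions per region server."""
    return num_region_servers * regions_per_server


def oversized_regions(region_sizes_mb, factor=2.0):
    """Return the names of regions larger than `factor` times the average size.

    `region_sizes_mb` maps a region name to its size in MB (assumed input;
    how the sizes are collected is out of scope for this sketch).
    """
    if not region_sizes_mb:
        return []
    average = sum(region_sizes_mb.values()) / len(region_sizes_mb)
    return sorted(
        name for name, size in region_sizes_mb.items() if size > factor * average
    )
```

For example, with 10 region servers and N = 5 the first function suggests 50 regions, and the second one would flag the handful of unusually large regions that are the likely cause of the slow tail of a full scan.
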
"I couldn't find a tool to show regions and their sizes, for a specific
table, so ended up writing one."

I think this is something we definitely need to have in the shell/web UI (how to filter/page it when there are many thousands of regions is a separate question). But that's probably a different discussion. I can open a JIRA for that.

On Fri, Jun 19, 2015 at 9:19 AM, Nick Dimiduk <[email protected]> wrote:

> On Fri, Jun 19, 2015 at 7:45 AM, Nasron Cheong <[email protected]> wrote:
>
>> I couldn't find a tool to show regions and their sizes, for a specific
>> table, so ended up writing one.
>>
>
> Nasron,
>
> Would you mind having a look at the patch/RB on HBASE-13103? Does the API
> pair RegionNormalizer/Normalization plan look like a reasonable harness for
> you to hang your custom tool onto? Just like the balancer, it's designed to
> be extensible with different normalization strategies.
>
> On Fri, Jun 19, 2015 at 3:47 AM, Dejan Menges <[email protected]> wrote:
>>
>> > Just have to say that hbase.master.loadbalance.bytable saved us after we
>> > discovered it. In our case we had to set it manually to true, and then it
>> > was easy to catch hot spotting on unusually large regions and handle it.
>> >
>> > Btw +1 for HBASE-13013, had to say it, something that makes me starting
>> > upgrading our HDP stack on Monday morning.
>> >
>> > On Thu, Jun 18, 2015 at 11:04 PM, Bryan Beaudreault <[email protected]> wrote:
>> >
>> > > Just had to say, https://issues.apache.org/jira/browse/HBASE-13103 looks
>> > > *AWESOME*
>> > >
>> > > On Thu, Jun 18, 2015 at 5:00 PM Mikhail Antonov <[email protected]> wrote:
>> > >
>> > > > Yeah, I could see 2 reasons for remaining few regions to take
>> > > > unproportionally long time - 1) those regions are unproportionally
>> > > > large (you should be able to quickly confirm it) and 2) they happened
>> > > > to be hosted on really slow/overloaded machine(s). #1 seems far more
>> > > > likely to me.
>> > > >
>> > > > And as Nick said, there's ongoing effort to provide exactly what
>> > > > you've described - centralized periodic analysis of region sizes and
>> > > > equalization as needed (somewhat complementary to balancing), and any
>> > > > feedback (especially from folks experiencing real issues with unequal
>> > > > region sizes) is much appreciated.
>> > > >
>> > > > -Mikhail
>> > > >
>> > > > On Thu, Jun 18, 2015 at 10:07 AM, Nick Dimiduk <[email protected]> wrote:
>> > > > > If you're interested in region size balancing, please have a look at
>> > > > > https://issues.apache.org/jira/browse/HBASE-13103 . Please provide
>> > > > > feedback as we're hoping to have an early version available in 1.2.
>> > > > >
>> > > > > Which reminds me, I owe Mikhail another review...
>> > > > >
>> > > > > On Thu, Jun 18, 2015 at 9:39 AM, Elliott Clark <[email protected]> wrote:
>> > > > >
>> > > > >> The balancer is not responsible fore region size decisions. The
>> > > > >> balancer is only responsible for deciding which regionservers
>> > > > >> should host which regions.
>> > > > >> Splits are determined by data size of a region. See max store file
>> > > > >> size.
>> > > > >>
>> > > > >> On Thu, Jun 18, 2015 at 7:50 AM, Nasron Cheong <[email protected]> wrote:
>> > > > >>
>> > > > >> > Hi,
>> > > > >> >
>> > > > >> > I've noticed there are two settings available when using the HBase
>> > > > >> > balancer (specifically the default stochastic balancer)
>> > > > >> >
>> > > > >> > hbase.master.balancer.stochastic.tableSkewCost
>> > > > >> >
>> > > > >> > hbase.master.loadbalance.bytable
>> > > > >> >
>> > > > >> > How do these two settings relate? The documentation indicates when
>> > > > >> > using the stochastic balancer that 'bytable' should be set to false?
>> > > > >> >
>> > > > >> > Our deployment relies on very few, very large tables, and I've
>> > > > >> > noticed bad distribution when accessing some of the tables. E.g.
>> > > > >> > there are 443 regions for a single table, but when doing a MR job
>> > > > >> > over a full scan of the table, the first 426 regions scan quickly
>> > > > >> > (minutes), but the remaining 17 regions take significantly longer
>> > > > >> > (hours)
>> > > > >> >
>> > > > >> > My expectation is to have the balancer equalize the size of the
>> > > > >> > regions for each table.
>> > > > >> >
>> > > > >> > Thanks!
>> > > > >> >
>> > > > >> > - Nasron
>> > > > >> >
>> > > > >>
>> > > > >
>> > > >
>> > > > --
>> > > > Thanks,
>> > > > Michael Antonov
>> > > >
>> > >
>> >
>>

--
Thanks,
Michael Antonov
