Hi Mikhail,

> Something like N * number of region servers, or something different?

It's pretty much this, but we additionally try to ensure that, in the
best case, the whole column fits in memory per region server. Another
restriction: since # MR map tasks == # of regions, we need to keep the
number of regions reasonable, as other MR tasks may need the slots.

Our use case is really time-based data, so as newer data comes in it
might be better to relegate older data to larger regions? Not sure if
that's something we should consider.

> I think this something which we definitely need to have in shell/web
> (separate question of how to filter/page it if there are many
> thousands regions). But that's probably different discussion. I can
> open jira for that.

While you're at it, we use long row keys in order to take advantage of
fast start/stop filtering with Scanners, but it makes the current UI
listing of region info unreadable. Another useful piece of info would be
some indication of a region's locality, request hits on the region,
bloom filter hits, etc. I'm sure there are all kinds of things that
people want. :)

I was trying to get Hannibal working, in order to get more visibility,
but it has some issues running on our cluster. I'm sure there are more
ideas there as well.

Thanks!

- Nasron

On Fri, Jun 19, 2015 at 3:16 PM, Mikhail Antonov <[email protected]> wrote:
> Nasron,
>
> Yeah, looks like you pretty much implemented pieces of logic being
> discussed in HBASE-13103 :) So that's interesting, thanks for telling
> us. Wondering, how did you estimate the number of desired regions?
> Something like N * number of region servers, or something different?
>
> "I couldn't find a tool to show regions and their sizes, for a specific
> table, so ended up writing one."
>
> I think this something which we definitely need to have in shell/web
> (separate question of how to filter/page it if there are many
> thousands regions). But that's probably different discussion. I can
> open jira for that.
>
> On Fri, Jun 19, 2015 at 9:19 AM, Nick Dimiduk <[email protected]> wrote:
> > On Fri, Jun 19, 2015 at 7:45 AM, Nasron Cheong <[email protected]> wrote:
> >
> >> I couldn't find a tool to show regions and their sizes, for a specific
> >> table, so ended up writing one.
> >>
> >
> > Nasron,
> >
> > Would you mind having a look at the patch/RB on HBASE-13103? Does the API
> > pair RegionNormalizer/Normalization plan look like a reasonable harness for
> > you to hang your custom tool onto? Just like the balancer, it's designed to
> > be extensible with different normalization strategies.
> >
> >> On Fri, Jun 19, 2015 at 3:47 AM, Dejan Menges <[email protected]> wrote:
> >>
> >> > Just have to say that hbase.master.loadbalance.bytable saved us after we
> >> > discovered it. In our case we had to set it manually to true, and then it
> >> > was easy to catch hot spotting on unusually large regions and handle it.
> >> >
> >> > Btw +1 for HBASE-13013, had to say it, something that makes me starting
> >> > upgrading our HDP stack on Monday morning.
> >> >
> >> > On Thu, Jun 18, 2015 at 11:04 PM, Bryan Beaudreault <[email protected]> wrote:
> >> >
> >> > > Just had to say, https://issues.apache.org/jira/browse/HBASE-13103 looks
> >> > > *AWESOME*
> >> > >
> >> > > On Thu, Jun 18, 2015 at 5:00 PM Mikhail Antonov <[email protected]> wrote:
> >> > >
> >> > > > Yeah, I could see 2 reasons for remaining few regions to take
> >> > > > unproportionally long time - 1) those regions are unproportionally
> >> > > > large (you should be able to quickly confirm it) and 2) they happened
> >> > > > to be hosted on really slow/overloaded machine(s). #1 seems far more
> >> > > > likely to me.
> >> > > >
> >> > > > And as Nick said, there's ongoing effort to provide exactly what
> >> > > > you've described - centralized periodic analysis of region sizes and
> >> > > > equalization as needed (somewhat complementary to balancing), and any
> >> > > > feedback (especially from folks experiencing real issues with unequal
> >> > > > region sizes) is much appreciated.
> >> > > >
> >> > > > -Mikhail
> >> > > >
> >> > > > On Thu, Jun 18, 2015 at 10:07 AM, Nick Dimiduk <[email protected]> wrote:
> >> > > > > If you're interested in region size balancing, please have a look at
> >> > > > > https://issues.apache.org/jira/browse/HBASE-13103 . Please provide feedback
> >> > > > > as we're hoping to have an early version available in 1.2.
> >> > > > >
> >> > > > > Which reminds me, I owe Mikhail another review...
> >> > > > >
> >> > > > > On Thu, Jun 18, 2015 at 9:39 AM, Elliott Clark <[email protected]> wrote:
> >> > > > >
> >> > > > >> The balancer is not responsible for region size decisions. The balancer is
> >> > > > >> only responsible for deciding which regionservers should host which
> >> > > > >> regions.
> >> > > > >> Splits are determined by data size of a region. See max store file size.
> >> > > > >>
> >> > > > >> On Thu, Jun 18, 2015 at 7:50 AM, Nasron Cheong <[email protected]> wrote:
> >> > > > >>
> >> > > > >> > Hi,
> >> > > > >> >
> >> > > > >> > I've noticed there are two settings available when using the HBase balancer
> >> > > > >> > (specifically the default stochastic balancer)
> >> > > > >> >
> >> > > > >> > hbase.master.balancer.stochastic.tableSkewCost
> >> > > > >> >
> >> > > > >> > hbase.master.loadbalance.bytable
> >> > > > >> >
> >> > > > >> > How do these two settings relate? The documentation indicates when using
> >> > > > >> > the stochastic balancer that 'bytable' should be set to false?
> >> > > > >> >
> >> > > > >> > Our deployment relies on very few, very large tables, and I've noticed bad
> >> > > > >> > distribution when accessing some of the tables. E.g. there are 443 regions
> >> > > > >> > for a single table, but when doing a MR job over a full scan of the table,
> >> > > > >> > the first 426 regions scan quickly (minutes), but the remaining 17 regions
> >> > > > >> > take significantly longer (hours)
> >> > > > >> >
> >> > > > >> > My expectation is to have the balancer equalize the size of the regions for
> >> > > > >> > each table.
> >> > > > >> >
> >> > > > >> > Thanks!
> >> > > > >> >
> >> > > > >> > - Nasron
> >> > > > >> >
> >> > > > >>
> >> > > >
> >> > > > --
> >> > > > Thanks,
> >> > > > Michael Antonov
> >> > > >
> >> > >
> >> >
> >>
> >
>
> --
> Thanks,
> Michael Antonov
>
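
For readers following the thread, here is a minimal sketch of the region-count heuristic described in the top message: "N * number of region servers", capped so a full-table MR scan (one map task per region) leaves slots for other jobs, plus a best-case memory check. It is not the tool mentioned in the thread. The function names, parameters, and the reading of "fit the whole column in memory per region server" as an even per-server share of the column's data are all assumptions made for illustration.

```python
def desired_region_count(region_servers, regions_per_server, max_map_slots):
    """Hypothetical helper: N * number of region servers, capped by MR slots.

    regions_per_server is the "N" from the thread; max_map_slots is the
    number of map slots we are willing to let one full-table scan occupy
    (assumed input, since # MR map tasks == # of regions).
    """
    if region_servers <= 0 or regions_per_server <= 0 or max_map_slots <= 0:
        raise ValueError("all arguments must be positive")
    return min(region_servers * regions_per_server, max_map_slots)


def column_fits_in_memory(column_bytes, region_servers, usable_bytes_per_server):
    """Hypothetical best-case check: does an even share of the column's data
    fit in the memory budget of each region server?

    Assumes data is spread evenly across servers, which is the best case.
    """
    if region_servers <= 0:
        raise ValueError("region_servers must be positive")
    # Integer ceiling division, so a partial byte share rounds up.
    per_server = (column_bytes + region_servers - 1) // region_servers
    return per_server <= usable_bytes_per_server
```

With 10 region servers and N = 20, the uncapped target is 200 regions; a budget of 150 map slots would cap it at 150. Both checks ignore region locality and uneven region sizes, which is what the normalizer work in HBASE-13103 is aimed at.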
