Actually there may be a simpler solution: http://pastebin.com/3KJ7Vxnc
We can check the ratio between online regions and the total number of regions
in IncreasingToUpperBoundRegionSplitPolicy#shouldSplit(). Splitting should
start only once that ratio exceeds a certain threshold.

FYI

On Thu, Mar 24, 2016 at 12:39 PM, Ted Yu <[email protected]> wrote:

> Currently IncreasingToUpperBoundRegionSplitPolicy doesn't detect when the
> master initialization finishes.
>
> There is also some missing piece where region server notifies the
> completion of cluster initialization (by looking at RegionServerObserver).
>
> Cheers
>
> On Thu, Mar 24, 2016 at 3:50 AM, Bram Desoete <[email protected]> wrote:
>
>> Pedro Gandola <pedro.gandola@...> writes:
>>
>> > Hi Ted,
>> >
>> > Thanks,
>> > I think I got the problem, I'm using
>> > *IncreasingToUpperBoundRegionSplitPolicy (default)* instead
>> > *ConstantSizeRegionSplitPolicy* which in my use case is what I want.
>> >
>> > Cheers
>> > Pedro
>> >
>> > On Mon, Feb 15, 2016 at 5:22 PM, Ted Yu <yuzhihong@...> wrote:
>> >
>> > > Can you pastebin region server log snippet around the time when the
>> > > split happened ?
>> > >
>> > > Was the split on data table or index table ?
>> > >
>> > > Thanks
>> > >
>> > > > On Feb 15, 2016, at 10:22 AM, Pedro Gandola <pedro.gandola@...>
>> > > > wrote:
>> > > >
>> > > > Hi,
>> > > >
>> > > > I have a cluster using *HBase 1.1.2* where I have a table and a local
>> > > > index (using *Apache Phoenix 4.6*) in total both tables have
>> > > > *300 regions* (aprox: *18 regions per server*), my
>> > > > *hbase.hregion.max.filesize=30GB* and my region sizes are now
>> > > > *~4.5GB compressed (~7GB uncompressed)*. However each time I restart
>> > > > a RS sometimes a region gets split.
>> > > > This is unexpected because my key space is uniform (using MD5) and if
>> > > > the problem was my *region.size > hbase.hregion.max.filesize* I would
>> > > > expect to have all the regions or almost all splitting but this only
>> > > > happens when I restart a RS and it happens only for 1 or 2 regions.
>> > > >
>> > > > What are the different scenarios where a region can split?
>> > > >
>> > > > What are the right steps to restart a region server in order to avoid
>> > > > these unexpected splits?
>> > > >
>> > > > Thank you,
>> > > > Cheers
>> > > > Pedro
>>
>> Thanks Pedro for giving your solution.
>>
>> i see the same issue during Hbase restarts. unexpected region splits.
>> i believe it is because the *IncreasingToUpperBoundRegionSplitPolicy* is
>> basing his calculation on the amount of ONLINE regions.
>> but while the RS is starting only a couple of regions are online YET.
>> so the policy things it would be no problem to add another region
>> since 'there are only a few'.
>> (while there are actually already are 330 for that RS for that phoenix
>> table...
>> yes i know i need to merge regions.
>> but this problem got out of hand unnoticed for some time now here)
>>
>> could HBase block split region decision until it is fully up and running?
>>
>> Hbase 1.0.0 logs.
>> (check mainly the last line)
>>
>> Mar 24, 11:06:41.494 AM INFO
>> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher
>> Flushed, sequenceid=69436099, memsize=303.3 K, hasBloomFilter=true, into tmp file
>> hdfs://ns/hbase/data/default/CUSTOMER/60af2857a7980ce4f1ac602dd83e05a6/.tmp/0fd4988f24f24d5d9887c542182efccc
>>
>> Mar 24, 11:06:41.529 AM INFO
>> org.apache.hadoop.hbase.regionserver.HStore
>> Added hdfs://-ns/hbase/data/default/CUSTOMER/ff4ecd56e6b06f228404f05f171f8282/0/1d05cf9cac4c46008e47e3578e7a18d6,
>> entries=235, sequenceid=22828972, filesize=5.5 K
>>
>> Mar 24, 11:06:41.561 AM INFO
>> org.apache.hadoop.hbase.regionserver.HStore
>> Completed compaction of 3 (all) file(s) in s of
>> CUSTOMER,\x0A0+\xF6\xD8,1457121856469.183f6134683e0213ccb15558a56f7c02.
>> into 730489295b8c42afaec4a3b8bc38c915(size=1.4 M),
>> total size for store is 1.4 M. This selection was in queue for
>> 0sec, and took 0sec to execute.
>>
>> Mar 24, 11:06:41.561 AM INFO
>> org.apache.hadoop.hbase.regionserver.CompactSplitThread
>> Completed compaction: Request = regionName=CUSTOMER,
>> \x0A0+\xF6\xD8,1457121856469.183f6134683e0213ccb15558a56f7c02.,
>> storeName=s, fileCount=3, fileSize=1.7 M, priority=7,
>> time=1456532583179472;
>> duration=0sec
>>
>> Mar 24, 11:06:41.562 AM DEBUG
>> org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy
>> ShouldSplit because IB size=3269370636, sizeToCheck=2147483648,
>> regionsWithCommonTable=2
>>
>> i will also revert back to the ConstantSizeRegionSplitPolicy
>>
>> Regards,
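
To illustrate the ratio check proposed at the top of this message, here is a
minimal standalone Java sketch. It is not HBase code and not the contents of
the pastebin: the class SplitGate, its method names and the 0.9 threshold are
all hypothetical, and it simply assumes the region server can learn the total
number of regions it is expected to host, which the thread leaves open. The
sizes are taken from the DEBUG line quoted above and the count of 330 from
Bram's message.

```java
// Hypothetical sketch of a ratio gate in front of a size-based split check.
// None of these names exist in HBase; they are for illustration only.
final class SplitGate {
    private SplitGate() {}

    /** True once the share of regions already online reaches minOnlineRatio. */
    static boolean enoughRegionsOnline(int onlineRegions, int totalRegions,
                                       double minOnlineRatio) {
        if (totalRegions <= 0) {
            // Total unknown: conservatively assume startup is still in progress.
            return false;
        }
        return (double) onlineRegions / totalRegions >= minOnlineRatio;
    }

    /** Applies the size check only after enough regions have come online. */
    static boolean shouldSplit(long regionSizeBytes, long sizeToCheckBytes,
                               int onlineRegions, int totalRegions,
                               double minOnlineRatio) {
        if (!enoughRegionsOnline(onlineRegions, totalRegions, minOnlineRatio)) {
            return false; // hold off while the region server is still opening regions
        }
        return regionSizeBytes > sizeToCheckBytes;
    }

    public static void main(String[] args) {
        long size = 3269370636L;        // "size" from the quoted DEBUG line
        long sizeToCheck = 2147483648L; // "sizeToCheck" from the same line

        // Only 2 of 330 regions online: the gate suppresses the split.
        System.out.println(shouldSplit(size, sizeToCheck, 2, 330, 0.9));   // false

        // All 330 online: the size comparison decides (sizeToCheck held fixed here).
        System.out.println(shouldSplit(size, sizeToCheck, 330, 330, 0.9)); // true
    }
}
```

In a real policy sizeToCheck itself depends on the number of regions, so the
second call only shows that the gate opens. It does not claim that HBase would
split this region once all regions are online.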
