In your log:

2014-11-25 13:31:35,048 WARN [MemStoreFlusher.13] regionserver.MemStoreFlusher: Region usertable2,user8289,1416889268210.7e8fd83bb34b155bd0385aa63124a875. has too many store files; delaying flush up to 90000ms
Please see my original reply: you can try increasing "hbase.hstore.blockingStoreFiles". Also, you have only 1 RS and you split the table into 100 regions; you could try 2 RS with 20 regions.
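For reference, a minimal hbase-site.xml sketch of the two RegionServer settings discussed in this thread; the values of 20 are only illustrative assumptions, not recommendations, and need to be tuned against the actual ingest rate and hardware:

<!-- Sketch only: raise the per-store HFile count above which memstore flushes are
     delayed (the "has too many store files; delaying flush" warning above). -->
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>20</value>
</property>

<!-- Sketch only: more concurrent memstore flush threads (the default is 2);
     20 is the value the original poster reports trying in the quoted messages below. -->
<property>
  <name>hbase.hstore.flusher.count</name>
  <value>20</value>
</property>

Both are RegionServer-side settings, so they generally require a RegionServer restart to take effect.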

On Tue, Nov 25, 2014 at 3:42 PM, louis.hust <[email protected]> wrote:

> Yes, the stack trace is like below:
>
> 2014-11-25 13:35:40:946 4260 sec: 232700856 operations; 28173.18 current ops/sec; [INSERT AverageLatency(us)=637.59]
> 2014-11-25 13:35:50:946 4270 sec: 232700856 operations; 0 current ops/sec;
> 14/11/25 13:35:59 INFO client.AsyncProcess: #14, table=usertable2, attempt=10/35 failed 109 ops, last exception:
> org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
> regionName=usertable2,user8289,1416889268210.7e8fd83bb34b155bd0385aa63124a875., server=l-hbase10.dba.cn1.qunar.com,60020,1416889404151,
> memstoreSize=536886800, blockingMemStoreSize=536870912
>         at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>         at java.lang.Thread.run(Thread.java:744)
>
> Then I looked up the memstore size for user8289: it was 512M, and now (15:40) it is still 512M.
>
> The region server log is attached, which may help.
>
> On Nov 25, 2014, at 15:27, ramkrishna vasudevan <[email protected]> wrote:
>
>> Are you getting any exceptions in the log? Do you have a stack trace when it is blocked?
>>
>> On Tue, Nov 25, 2014 at 12:30 PM, louis.hust <[email protected]> wrote:
>>
>>> Hi Ram,
>>>
>>> After I modified hbase.hstore.flusher.count, it just improved the load, but after one hour the YCSB load program is still blocked! Then I changed hbase.hstore.flusher.count to 40, but it is the same as with 20.
>>>
>>> On Nov 25, 2014, at 14:47, ramkrishna vasudevan <[email protected]> wrote:
>>>
>>>>> hbase.hstore.flusher.count to 20 (default value is 2), and run the YCSB to load data with 32 threads
>>>>
>>>> Apologies for the late reply. Your change of configuration from 2 to 20 is right in this case, because your data ingest rate is high, I suppose.
>>>>
>>>> Thanks for the reply.
>>>>
>>>> Regards
>>>> Ram
>>>>
>>>> On Tue, Nov 25, 2014 at 12:09 PM, louis.hust <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I reran the YCSB data load, and here is a situation which may explain why the load got blocked.
>>>>>
>>>>> I use too many threads to insert values, so the flush threads cannot effectively handle all the memstores; the user9099 memstore is queued last and waits too long for a flush, which blocks the YCSB requests.
>>>>>
>>>>> Then I modified the configuration, set hbase.hstore.flusher.count to 20 (the default value is 2), and ran the YCSB load with 32 threads; it can run for 1 hour (with 2 threads it ran for less than half an hour).
>>>>>
>>>>> On Nov 20, 2014, at 23:20, louis.hust <[email protected]> wrote:
>>>>>
>>>>>> Hi Ram,
>>>>>>
>>>>>> Thanks for your reply!
>>>>>>
>>>>>> I use YCSB workloadc to load data, and from the web request monitor I can see that the write requests are distributed among all regions, so I think the data is distributed.
>>>>>>
>>>>>> And there are 32 threads writing to the region server; maybe the concurrency and write rate are too high. The writes are blocked but the memstore does not get flushed, and I want to know why.
>>>>>>
>>>>>> The JVM heap is 64G and hbase.regionserver.global.memstore.size is the default (0.4), about 25.6G, and hbase.hregion.memstore.flush.size is the default (128M), but the blocked memstore user9099 reaches 512M and does not flush at all.
>>>>>>
>>>>>> Other memstore-related options:
>>>>>>
>>>>>> hbase.hregion.memstore.mslab.enabled=true
>>>>>> hbase.regionserver.global.memstore.upperLimit=0.4
>>>>>> hbase.regionserver.global.memstore.lowerLimit=0.38
>>>>>> hbase.hregion.memstore.block.multiplier=4
>>>>>>
>>>>>> On Nov 20, 2014, at 20:38, ramkrishna vasudevan <[email protected]> wrote:
>>>>>>
>>>>>>> Check if the writes are going to that particular region and its rate is too high. Ensure that the data gets distributed among all regions.
>>>>>>> What is the memstore size?
>>>>>>>
>>>>>>> If the rate of writes is very high then the flushes will get queued, and writes will be blocked until the memstores get flushed and drop below the global upper limit.
>>>>>>>
>>>>>>> I don't have the code now to see the exact config related to memstore.
>>>>>>>
>>>>>>> Regards
>>>>>>> Ram
>>>>>>>
>>>>>>> On Thu, Nov 20, 2014 at 4:50 PM, louis.hust <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I built an HBase test environment with three PC servers, running CDH 5.1.0:
>>>>>>>>
>>>>>>>> pc1 pc2 pc3
>>>>>>>>
>>>>>>>> pc1 and pc2 as HMaster and Hadoop NameNode
>>>>>>>> pc3 as RegionServer and DataNode
>>>>>>>>
>>>>>>>> Then I create the table as follows:
>>>>>>>>
>>>>>>>> create 'usertable', 'family', {SPLITS => (1..100).map {|i| "user#{1000+i*(9999-1000)/100}"} }
>>>>>>>>
>>>>>>>> and use YCSB to load data as follows:
>>>>>>>>
>>>>>>>> ./bin/ycsb load hbase -P workloads/workloadc -p columnfamily=family -p recordcount=1000000000 -p threadcount=32 -s > result/workloadc
>>>>>>>>
>>>>>>>> But after a while, YCSB returns the following error:
>>>>>>>>
>>>>>>>> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable, attempt=35/35 failed 715 ops, last exception:
>>>>>>>> org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
>>>>>>>> regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9., server=l-hbase10.dba.cn1,60020,1416451280772,
>>>>>>>> memstoreSize=536897120, blockingMemStoreSize=536870912
>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
>>>>>>>>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
>>>>>>>>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>>>>>>>>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>>>>>>>>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>>>>>>>>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>>>>>>>>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>>>>>>>>         at java.lang.Thread.run(Thread.java:744)
>>>>>>>> on l-hbase10.dba.cn1,60020,1416451280772, tracking started Thu Nov 20 12:15:07 CST 2014, retrying after 20051 ms, replay 715 ops.
>>>>>>>>
>>>>>>>> It seems the user9099 region is too busy, so I looked at the memstore metrics in the web UI.
>>>>>>>>
>>>>>>>> As you can see, user9099 is bigger than the other regions. I thought it was flushing, but after a while it did not shrink, and YCSB finally quit.
>>>>>>>>
>>>>>>>> But when I change the concurrency to 4 threads, everything is fine. I want to know why.
>>>>>>>>
>>>>>>>> Any ideas will be appreciated.
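
A note on the 536870912-byte blockingMemStoreSize that appears in both exceptions above: it matches the per-region blocking threshold implied by the settings quoted in this thread (the default flush size times the block multiplier). A sketch of that arithmetic, using the stock hbase-site.xml property names:

<!-- Per-region write blocking threshold:
     hbase.hregion.memstore.flush.size (default 134217728 bytes = 128 MB)
     * hbase.hregion.memstore.block.multiplier (4, as listed above)
     = 536870912 bytes (512 MB), the blockingMemStoreSize seen in the exceptions. -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>134217728</value>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value>
</property>

Writes to a region are rejected with RegionTooBusyException (HRegion.checkResources in the traces above) once that region's memstore exceeds this product, independently of the global limit (hbase.regionserver.global.memstore.upperLimit, 0.4 of the 64 GB heap here, roughly 25.6 GB). That is why a single region stuck at 512 MB can block the load long before the global memstore limit is reached.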
