Hi Ram, thanks for the help. I was just running a test for the bucket cache; in the production environment we will follow your suggestion.
Sent from my iPhone

> On Nov 25, 2014, at 20:36, ramkrishna vasudevan <[email protected]> wrote:
>
> Your write ingest is too high. You have to control that by first adding
> more nodes and ensuring that you have a more distributed load. And also
> try changing hbase.hstore.blockingStoreFiles.
>
> Even after changing the above value, if your write ingest is so high that it
> can reach the configured value again, you can still see blocking writes.
>
> Regards
> Ram
>
>> On Tue, Nov 25, 2014 at 2:20 PM, Qiang Tian <[email protected]> wrote:
>>
>> in your log:
>> 2014-11-25 13:31:35,048 WARN [MemStoreFlusher.13] regionserver.MemStoreFlusher:
>> Region usertable2,user8289,1416889268210.7e8fd83bb34b155bd0385aa63124a875. has too
>> many store files; delaying flush up to 90000ms
>>
>> please see my original reply... you can try increasing
>> "hbase.hstore.blockingStoreFiles"; also you have only 1 RS and you split into
>> 100 regions... you can try 2 RS with 20 regions.
>>
>>> On Tue, Nov 25, 2014 at 3:42 PM, louis.hust <[email protected]> wrote:
>>>
>>> yes, the stack trace is like below:
>>>
>>> 2014-11-25 13:35:40:946 4260 sec: 232700856 operations; 28173.18 current ops/sec; [INSERT AverageLatency(us)=637.59]
>>> 2014-11-25 13:35:50:946 4270 sec: 232700856 operations; 0 current ops/sec;
>>> 14/11/25 13:35:59 INFO client.AsyncProcess: #14, table=usertable2, attempt=10/35 failed 109 ops, last exception:
>>> org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
>>> regionName=usertable2,user8289,1416889268210.7e8fd83bb34b155bd0385aa63124a875.,
>>> server=l-hbase10.dba.cn1.qunar.com,60020,1416889404151,
>>> memstoreSize=536886800, blockingMemStoreSize=536870912
>>>         at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
>>>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
>>>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
>>>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
>>>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
>>>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
>>>         at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
>>>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
>>>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>>>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>>>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>>>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>>>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>>>         at java.lang.Thread.run(Thread.java:744)
>>>
>>> Then i looked up the memstore size for user8289, it is 512M, and it is still 512M now (15:40).
>>>
>>> The region server log is attached, which may help.
>>>
>>> On Nov 25, 2014, at 15:27, ramkrishna vasudevan <[email protected]> wrote:
>>>
>>>> Are you getting any exceptions in the log? Do you have a stack trace when
>>>> it is blocked?
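For reference, the hbase.hstore.blockingStoreFiles setting mentioned above lives in hbase-site.xml. A minimal sketch, with purely illustrative values (how high it can safely go depends on how fast compactions keep up on the region server):

<property>
  <!-- illustrative value: once any store in a region has this many store files,
       flushes for that region are delayed and writes can eventually block -->
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>20</value>
</property>
<property>
  <!-- maximum time a flush is delayed for that reason; 90000 ms is the default,
       which matches the "delaying flush up to 90000ms" warning in the log above -->
  <name>hbase.hstore.blockingWaitTime</name>
  <value>90000</value>
</property>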
>>>>
>>>>> On Tue, Nov 25, 2014 at 12:30 PM, louis.hust <[email protected]> wrote:
>>>>>
>>>>> hi, Ram
>>>>>
>>>>> After i modified hbase.hstore.flusher.count, it just improved the load,
>>>>> but after one hour the YCSB load program is still blocked! Then I changed
>>>>> hbase.hstore.flusher.count to 40, but it is the same as 20.
>>>>>
>>>>> On Nov 25, 2014, at 14:47, ramkrishna vasudevan <[email protected]> wrote:
>>>>>
>>>>>>>> hbase.hstore.flusher.count to 20 (default value is 2), and run the YCSB
>>>>>>>> to load data with 32 threads
>>>>>>
>>>>>> Apologies for the late reply. Your change of configuration from 2 to 20 is
>>>>>> right in this case because your data ingest rate is high, I suppose.
>>>>>>
>>>>>> Thanks for the reply.
>>>>>>
>>>>>> Regards
>>>>>> Ram
>>>>>>
>>>>>>> On Tue, Nov 25, 2014 at 12:09 PM, louis.hust <[email protected]> wrote:
>>>>>>>
>>>>>>> hi, all
>>>>>>>
>>>>>>> I retested the YCSB data load, and here is a situation which may explain
>>>>>>> why the load gets blocked.
>>>>>>>
>>>>>>> I use too many threads to insert values, so the flush threads cannot
>>>>>>> handle all the memstores effectively; the user9099 memstore is queued
>>>>>>> last and waits for its flush too long, which blocks the YCSB requests.
>>>>>>>
>>>>>>> Then I modified the configuration, set hbase.hstore.flusher.count to 20
>>>>>>> (default value is 2), and ran the YCSB load with 32 threads; it can run
>>>>>>> for 1 hour (with 2 threads it ran for less than half an hour).
>>>>>>>
>>>>>>>> On Nov 20, 2014, at 23:20, louis.hust <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi Ram,
>>>>>>>>
>>>>>>>> Thanks for your reply!
>>>>>>>>
>>>>>>>> I use YCSB workloadc to load data, and from the web request monitor i
>>>>>>>> can see that the write requests are distributed among all regions, so i
>>>>>>>> think the data does get distributed.
>>>>>>>>
>>>>>>>> And there are 32 threads writing to the region server; maybe the
>>>>>>>> concurrency and write rate are too high. The writes are blocked but the
>>>>>>>> memstore does not get flushed, and i want to know why.
>>>>>>>>
>>>>>>>> The jvm heap is 64G and hbase.regionserver.global.memstore.size is
>>>>>>>> default (0.4), about 25.6G, and hbase.hregion.memstore.flush.size is
>>>>>>>> default (128M), but the blocked memstore user9099 reaches 512m and does
>>>>>>>> not flush at all.
>>>>>>>>
>>>>>>>> other memstore related options:
>>>>>>>>
>>>>>>>> hbase.hregion.memstore.mslab.enabled=true
>>>>>>>> hbase.regionserver.global.memstore.upperLimit=0.4
>>>>>>>> hbase.regionserver.global.memstore.lowerLimit=0.38
>>>>>>>> hbase.hregion.memstore.block.multiplier=4
>>>>>>>>
>>>>>>>>> On Nov 20, 2014, at 20:38, ramkrishna vasudevan <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Check if the writes are going to that particular region and whether its
>>>>>>>>> rate is too high. Ensure that the data gets distributed among all regions.
>>>>>>>>> What is the memstore size?
>>>>>>>>>
>>>>>>>>> If the rate of writes is very high then the flushing will get queued,
>>>>>>>>> and until the memstore gets flushed down below the global upper limit,
>>>>>>>>> writes will be blocked.
>>>>>>>>>
>>>>>>>>> I don't have the code now to see the exact config related to memstore.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Ram
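A quick check on the numbers above, assuming the stock 128 MB default for hbase.hregion.memstore.flush.size:

  134,217,728 bytes (128 MB flush size) x 4 (hbase.hregion.memstore.block.multiplier) = 536,870,912 bytes = 512 MB

That is exactly the blockingMemStoreSize=536870912 in the RegionTooBusyException, so writes are being rejected at the per-region limit (flush size times block multiplier), long before the global limit of 0.4 x 64 GB ≈ 25.6 GB comes into play. A region can sit at 512 MB like this when its flush is being delayed, as in the "has too many store files; delaying flush up to 90000ms" warning quoted earlier.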
>>>>>>>>>
>>>>>>>>> On Thu, Nov 20, 2014 at 4:50 PM, louis.hust <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> hi all,
>>>>>>>>>
>>>>>>>>> I built an HBASE test environment with three PC servers, on CDH 5.1.0:
>>>>>>>>>
>>>>>>>>> pc1 pc2 pc3
>>>>>>>>>
>>>>>>>>> pc1 and pc2 as HMaster and hadoop namenode
>>>>>>>>> pc3 as RegionServer and datanode
>>>>>>>>>
>>>>>>>>> Then I create the table as follows:
>>>>>>>>>
>>>>>>>>> create 'usertable', 'family', {SPLITS => (1..100).map {|i| "user#{1000+i*(9999-1000)/100}"} }
>>>>>>>>>
>>>>>>>>> and use YCSB to load data as follows:
>>>>>>>>>
>>>>>>>>> ./bin/ycsb load hbase -P workloads/workloadc -p columnfamily=family -p recordcount=1000000000 -p threadcount=32 -s > result/workloadc
>>>>>>>>>
>>>>>>>>> But after a while, ycsb returns the following error:
>>>>>>>>>
>>>>>>>>> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable, attempt=35/35 failed 715 ops, last exception:
>>>>>>>>> org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
>>>>>>>>> regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9.,
>>>>>>>>> server=l-hbase10.dba.cn1,60020,1416451280772, memstoreSize=536897120, blockingMemStoreSize=536870912
>>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
>>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
>>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
>>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
>>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
>>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
>>>>>>>>>         at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
>>>>>>>>>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
>>>>>>>>>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>>>>>>>>>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>>>>>>>>>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>>>>>>>>>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>>>>>>>>>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>>>>>>>>>         at java.lang.Thread.run(Thread.java:744)
>>>>>>>>> on l-hbase10.dba.cn1,60020,1416451280772, tracking started Thu Nov 20 12:15:07 CST 2014, retrying after 20051 ms, replay 715 ops.
>>>>>>>>>
>>>>>>>>> It seems the user9099 region is too busy, so I looked up the memstore metrics in the web UI.
>>>>>>>>>
>>>>>>>>> As you can see, the user9099 memstore is bigger than the other regions. I think it is
>>>>>>>>> flushing, but after a while it does not shrink to a small size, and YCSB finally quits.
>>>>>>>>>
>>>>>>>>> But when i change the concurrency to 4 threads, all is right. I want to know why.
>>>>>>>>>
>>>>>>>>> Any idea will be appreciated.
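As a rough sketch of the earlier "2 RS with 20 regions" suggestion, the same kind of pre-split can be generated with fewer split points; the 19 split keys below are only illustrative:

create 'usertable', 'family', {SPLITS => (1..19).map {|i| "user#{1000+i*(9999-1000)/20}"} }

With a second region server added, that leaves on the order of 10 regions per server instead of roughly 100 on one, so the flush and compaction queues on each server have far less to keep up with under the same write load.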
