Are you getting any exceptions in the log? Do you have a stack trace when it is blocked?
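If not, a thread dump of the RegionServer taken while the loader is stalled would show whether the flusher threads are stuck or just queued behind one region (e.g. `jstack <RegionServer pid> > rs-stack.txt`, run a couple of times a few seconds apart).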

On Tue, Nov 25, 2014 at 12:30 PM, louis.hust <[email protected]> wrote:

> Hi Ram,
>
> After I modified hbase.hstore.flusher.count, the load improved, but after
> one hour the YCSB load program was blocked again. Then I changed
> hbase.hstore.flusher.count to 40, but it behaves the same as with 20.
>
> On Nov 25, 2014, at 14:47, ramkrishna vasudevan
> <[email protected]> wrote:
>
> > >>> hbase.hstore.flusher.count to 20 (default value is 2), and run the
> > >>> YCSB to load data with 32 threads
> >
> > Apologies for the late reply. Your change of configuration from 2 to 20
> > is right in this case, because your data ingest rate is high, I suppose.
> >
> > Thanks for the reply.
> >
> > Regards
> > Ram
> >
> > On Tue, Nov 25, 2014 at 12:09 PM, louis.hust <[email protected]>
> > wrote:
> >
> >> Hi all,
> >>
> >> I retested the YCSB data load, and here is a situation which may
> >> explain why the load gets blocked.
> >>
> >> I used too many threads to insert values, so the flush threads could
> >> not handle all the memstores effectively; the user9099 memstore was
> >> queued last and waited too long for its flush, which blocked the YCSB
> >> requests.
> >>
> >> Then I modified the configuration, setting hbase.hstore.flusher.count
> >> to 20 (default value is 2), and ran the YCSB load with 32 threads. It
> >> ran for 1 hour (with 2 flushers it ran for less than half an hour).
> >>
> >> On Nov 20, 2014, at 23:20, louis.hust <[email protected]> wrote:
> >>
> >>> Hi Ram,
> >>>
> >>> Thanks for your reply!
> >>>
> >>> I use YCSB workloadc to load data, and from the web request monitor I
> >>> can see that the write requests are distributed among all regions, so
> >>> I think the data is distributed.
> >>>
> >>> There are 32 threads writing to the region server; maybe the
> >>> concurrency and write rate are too high. The writes are blocked but
> >>> the memstore does not get flushed. I want to know why.
> >>>
> >>> The JVM heap is 64G and hbase.regionserver.global.memstore.size is the
> >>> default (0.4), about 25.6G, and hbase.hregion.memstore.flush.size is
> >>> the default (128M), but the blocked memstore user9099 reaches 512M and
> >>> does not flush at all.
> >>>
> >>> Other memstore-related options:
> >>>
> >>> hbase.hregion.memstore.mslab.enabled=true
> >>> hbase.regionserver.global.memstore.upperLimit=0.4
> >>> hbase.regionserver.global.memstore.lowerLimit=0.38
> >>> hbase.hregion.memstore.block.multiplier=4
> >>>
> >>> On Nov 20, 2014, at 20:38, ramkrishna vasudevan
> >>> <[email protected]> wrote:
> >>>
> >>>> Check if the writes are going to that particular region and whether
> >>>> its rate is too high. Ensure that the data gets distributed among all
> >>>> regions. What is the memstore size?
> >>>>
> >>>> If the rate of writes is very high, then the flushes will get queued,
> >>>> and writes will be blocked until the memstores are flushed back below
> >>>> the global upper limit.
> >>>>
> >>>> I don't have the code now to see the exact config related to
> >>>> memstore.
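> >>>>
> >>>> From memory, the per-region blocking check in HRegion.checkResources()
> >>>> is roughly the following sketch (not the exact code, and the field
> >>>> names may differ between versions):
> >>>>
> >>>>   // blocking threshold = hbase.hregion.memstore.flush.size
> >>>>   //                    * hbase.hregion.memstore.block.multiplier;
> >>>>   // with your settings: 128 MB * 4 = 536870912 bytes, which is the
> >>>>   // blockingMemStoreSize reported in the exception below
> >>>>   long blockingMemStoreSize = memstoreFlushSize * blockMultiplier;
> >>>>   if (memstoreSize.get() > blockingMemStoreSize) {
> >>>>     requestFlush();  // queue a flush for this region
> >>>>     throw new RegionTooBusyException("Above memstore limit, ...");
> >>>>   }
> >>>>
> >>>> So once a single region's memstore crosses 512 MB and its flush is
> >>>> still waiting in the queue, every write to that region is rejected
> >>>> until the flush actually runs.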
> >>>>
> >>>> Regards
> >>>> Ram
> >>>>
> >>>> On Thu, Nov 20, 2014 at 4:50 PM, louis.hust <[email protected]>
> >>>> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> I built an HBase test environment with three PC servers, on CDH 5.1.0:
> >>>>
> >>>> pc1 pc2 pc3
> >>>>
> >>>> pc1 and pc2 act as HMaster and Hadoop NameNode;
> >>>> pc3 acts as RegionServer and DataNode.
> >>>>
> >>>> Then I created the table as follows:
> >>>>
> >>>> create 'usertable', 'family', {SPLITS => (1..100).map {|i|
> >>>> "user#{1000+i*(9999-1000)/100}"} }
> >>>>
> >>>> and used YCSB to load data as follows:
> >>>>
> >>>> ./bin/ycsb load hbase -P workloads/workloadc -p columnfamily=family
> >>>> -p recordcount=1000000000 -p threadcount=32 -s > result/workloadc
> >>>>
> >>>> But after a while, YCSB returned the following error:
> >>>>
> >>>> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable,
> >>>> attempt=35/35 failed 715 ops, last exception:
> >>>> org.apache.hadoop.hbase.RegionTooBusyException:
> >>>> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
> >>>> regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9.,
> >>>> server=l-hbase10.dba.cn1,60020,1416451280772, memstoreSize=536897120,
> >>>> blockingMemStoreSize=536870912
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
> >>>>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
> >>>>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
> >>>>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
> >>>>   at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
> >>>>   at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
> >>>>   at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
> >>>>   at java.lang.Thread.run(Thread.java:744)
> >>>> on l-hbase10.dba.cn1,60020,1416451280772, tracking started Thu Nov 20
> >>>> 12:15:07 CST 2014, retrying after 20051 ms, replay 715 ops.
> >>>>
> >>>> It seems the user9099 region is too busy, so I looked up the memstore
> >>>> metrics in the web UI.
> >>>>
> >>>> As you can see, user9099 is bigger than the other regions. I think it
> >>>> is flushing, but after a while it does not shrink to a smaller size,
> >>>> and YCSB finally quits.
> >>>>
> >>>> But when I change the thread count to 4, everything works. I want to
> >>>> know why.
> >>>>
> >>>> Any ideas will be appreciated.
