Are you getting any exceptions in the log? Do you have a stack trace when it is blocked?
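If not, a thread dump of the RegionServer taken while the loader is stalled would show whether the flusher threads are stuck or just queued behind one region (e.g. `jstack <RegionServer pid> > rs-stack.txt`, run a couple of times a few seconds apart).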

On Tue, Nov 25, 2014 at 12:30 PM, louis.hust <[email protected]> wrote:

> Hi Ram,
>
> After I modified hbase.hstore.flusher.count, the load improved, but after
> one hour the YCSB load program was blocked again. Then I changed
> hbase.hstore.flusher.count to 40, but it behaves the same as with 20.
>
> On Nov 25, 2014, at 14:47, ramkrishna vasudevan
> <[email protected]> wrote:
>
> > >>> hbase.hstore.flusher.count to 20 (default value is 2), and run the
> > >>> YCSB to load data with 32 threads
> >
> > Apologies for the late reply. Your change of configuration from 2 to 20
> > is right in this case, because your data ingest rate is high, I suppose.
> >
> > Thanks for the reply.
> >
> > Regards
> > Ram
> >
> > On Tue, Nov 25, 2014 at 12:09 PM, louis.hust <[email protected]>
> > wrote:
> >
> >> Hi all,
> >>
> >> I retested the YCSB data load, and here is a situation which may
> >> explain why the load gets blocked.
> >>
> >> I used too many threads to insert values, so the flush threads could
> >> not handle all the memstores effectively; the user9099 memstore was
> >> queued last and waited too long for its flush, which blocked the YCSB
> >> requests.
> >>
> >> Then I modified the configuration, setting hbase.hstore.flusher.count
> >> to 20 (default value is 2), and ran the YCSB load with 32 threads. It
> >> ran for 1 hour (with 2 flushers it ran for less than half an hour).
> >>
> >> On Nov 20, 2014, at 23:20, louis.hust <[email protected]> wrote:
> >>
> >>> Hi Ram,
> >>>
> >>> Thanks for your reply!
> >>>
> >>> I use YCSB workloadc to load data, and from the web request monitor I
> >>> can see that the write requests are distributed among all regions, so
> >>> I think the data is distributed.
> >>>
> >>> There are 32 threads writing to the region server; maybe the
> >>> concurrency and write rate are too high. The writes are blocked but
> >>> the memstore does not get flushed. I want to know why.
> >>>
> >>> The JVM heap is 64G and hbase.regionserver.global.memstore.size is the
> >>> default (0.4), about 25.6G, and hbase.hregion.memstore.flush.size is
> >>> the default (128M), but the blocked memstore user9099 reaches 512M and
> >>> does not flush at all.
> >>>
> >>> Other memstore-related options:
> >>>
> >>> hbase.hregion.memstore.mslab.enabled=true
> >>> hbase.regionserver.global.memstore.upperLimit=0.4
> >>> hbase.regionserver.global.memstore.lowerLimit=0.38
> >>> hbase.hregion.memstore.block.multiplier=4
> >>>
> >>> On Nov 20, 2014, at 20:38, ramkrishna vasudevan
> >>> <[email protected]> wrote:
> >>>
> >>>> Check if the writes are going to that particular region and whether
> >>>> its rate is too high. Ensure that the data gets distributed among all
> >>>> regions. What is the memstore size?
> >>>>
> >>>> If the rate of writes is very high, then the flushes will get queued,
> >>>> and writes will be blocked until the memstores are flushed back below
> >>>> the global upper limit.
> >>>>
> >>>> I don't have the code now to see the exact config related to
> >>>> memstore.
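> >>>>
> >>>> From memory, the per-region blocking check in HRegion.checkResources()
> >>>> is roughly the following sketch (not the exact code, and the field
> >>>> names may differ between versions):
> >>>>
> >>>>   // blocking threshold = hbase.hregion.memstore.flush.size
> >>>>   //                    * hbase.hregion.memstore.block.multiplier;
> >>>>   // with your settings: 128 MB * 4 = 536870912 bytes, which is the
> >>>>   // blockingMemStoreSize reported in the exception below
> >>>>   long blockingMemStoreSize = memstoreFlushSize * blockMultiplier;
> >>>>   if (memstoreSize.get() > blockingMemStoreSize) {
> >>>>     requestFlush();  // queue a flush for this region
> >>>>     throw new RegionTooBusyException("Above memstore limit, ...");
> >>>>   }
> >>>>
> >>>> So once a single region's memstore crosses 512 MB and its flush is
> >>>> still waiting in the queue, every write to that region is rejected
> >>>> until the flush actually runs.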
> >>>>
> >>>> Regards
> >>>> Ram
> >>>>
> >>>> On Thu, Nov 20, 2014 at 4:50 PM, louis.hust <[email protected]>
> >>>> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> I built an HBase test environment with three PC servers, on CDH 5.1.0:
> >>>>
> >>>> pc1 pc2 pc3
> >>>>
> >>>> pc1 and pc2 act as HMaster and Hadoop NameNode;
> >>>> pc3 acts as RegionServer and DataNode.
> >>>>
> >>>> Then I created the table as follows:
> >>>>
> >>>> create 'usertable', 'family', {SPLITS => (1..100).map {|i|
> >>>> "user#{1000+i*(9999-1000)/100}"} }
> >>>>
> >>>> and used YCSB to load data as follows:
> >>>>
> >>>> ./bin/ycsb load hbase -P workloads/workloadc -p columnfamily=family
> >>>> -p recordcount=1000000000 -p threadcount=32 -s > result/workloadc
> >>>>
> >>>> But after a while, YCSB returned the following error:
> >>>>
> >>>> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable,
> >>>> attempt=35/35 failed 715 ops, last exception:
> >>>> org.apache.hadoop.hbase.RegionTooBusyException:
> >>>> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
> >>>> regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9.,
> >>>> server=l-hbase10.dba.cn1,60020,1416451280772, memstoreSize=536897120,
> >>>> blockingMemStoreSize=536870912
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
> >>>>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
> >>>>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
> >>>>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
> >>>>   at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
> >>>>   at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
> >>>>   at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
> >>>>   at java.lang.Thread.run(Thread.java:744)
> >>>> on l-hbase10.dba.cn1,60020,1416451280772, tracking started Thu Nov 20
> >>>> 12:15:07 CST 2014, retrying after 20051 ms, replay 715 ops.
> >>>>
> >>>> It seems the user9099 region is too busy, so I looked up the memstore
> >>>> metrics in the web UI.
> >>>>
> >>>> As you can see, user9099 is bigger than the other regions. I think it
> >>>> is flushing, but after a while it does not shrink to a smaller size,
> >>>> and YCSB finally quits.
> >>>>
> >>>> But when I change the thread count to 4, everything works. I want to
> >>>> know why.
> >>>>
> >>>> Any ideas will be appreciated.
