Setting that property to false has not made any difference; hbase has just crashed again (it ran out of heap) and I am busy restarting it. What do I do now?
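To be explicit about what I changed, this is roughly the entry I added to hbase-site.xml on each region server (a sketch assuming the usual <property> wrapper around the name/value pair Ted quoted below, not a copy of my actual file):

  <property>
    <name>hbase.hregion.memstore.mslab.enabled</name>
    <value>false</value>
  </property>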
On 29 Dec 2011, at 5:56 PM, Seraph Imalia wrote:

> Thanks,
>
> I will try disabling it to see if the memory is being taken up by MSLAB.
>
> Regards,
> Seraph
>
> On 29 Dec 2011, at 5:47 PM, Ted Yu wrote:
>
>> mslab was introduced after 0.20.6
>>
>> Read Todd's series:
>> http://www.cloudera.com/blog/2011/03/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-3/
>>
>> Cheers
>>
>> On Thu, Dec 29, 2011 at 12:19 AM, Seraph Imalia <[email protected]> wrote:
>>
>>> Region Servers
>>>
>>> Address            Start Code     Load
>>> dynobuntu10:60030  1325081250180  requests=43, regions=224, usedHeap=3946, maxHeap=4087
>>> dynobuntu12:60030  1325081249966  requests=32, regions=224, usedHeap=3821, maxHeap=4087
>>> dynobuntu17:60030  1325081248407  requests=39, regions=225, usedHeap=4016, maxHeap=4087
>>> Total: servers: 3  requests=114, regions=673
>>>
>>> I restarted them yesterday; the number of regions increased from 667 to 673 and they are about to run out of heap again :(. Should I set that property to false? What does mslab do? Is it new after 0.20.6?
>>>
>>> Regards,
>>> Seraph
>>>
>>> On 28 Dec 2011, at 5:46 PM, Ted Yu wrote:
>>>
>>>> Can you tell me how many regions each region server hosts?
>>>>
>>>> In 0.90.4 there is this parameter:
>>>>   <name>hbase.hregion.memstore.mslab.enabled</name>
>>>>   <value>true</value>
>>>> mslab tends to consume heap if the region count is high.
>>>>
>>>> Cheers
>>>>
>>>> On Wed, Dec 28, 2011 at 6:27 AM, Seraph Imalia <[email protected]> wrote:
>>>>
>>>>> Hi Guys,
>>>>>
>>>>> After upgrading from 0.20.6 to 0.90.4, we have been having serious RAM issues. I had hbase-env.sh set to use 3 Gigs of RAM with 0.20.6, but with 0.90.4 even 4.5 Gigs seems not to be enough. It does not matter how much load the hbase services are under; they just crash after 24-48 hours. The only difference the load makes is how quickly the services crash. Even over this holiday season, with our lowest load of the year, it crashed just after 36 hours of being started. To fix it, I have to run the stop-hbase.sh command, wait a while, kill -9 any hbase processes that have stopped outputting logs or stopped responding, and then run start-hbase.sh again.
>>>>>
>>>>> Attached are my logs from the latest "start-to-crash". There are 3 servers, and hbase is being used for storing URLs - 7 client servers connect to hbase and perform URL lookups at about 40 requests per second (this is the low load over this holiday season). If the URL does not exist, it gets added. The key on the HTable is the URL, and there are a few fields stored against it - e.g. DateDiscovered, Host, Script, QueryString, etc.
>>>>>
>>>>> Each server has a hadoop datanode and an hbase regionserver, and one of the servers additionally has the namenode, master and zookeeper. On first start, each regionserver uses 2 Gigs (usedHeap), and as soon as I restart the clients, the usedHeap slowly climbs until it reaches the maxHeap; shortly after that, the regionservers start crashing - sometimes they actually shut down gracefully by themselves.
>>>>>
>>>>> Originally, we had hbase.regionserver.handler.count set to 100, and I have now removed that to leave it at the default, which has not helped.
>>>>>
>>>>> We have not made any changes to the clients, and we have a mirrored instance of this in our UK Data Centre which is still running 0.20.6 and servicing 10 clients currently at over 300 requests per second (again, low load over the holidays), and it is 100% stable.
>>>>>
>>>>> What do I do now? Your website says I cannot downgrade?
>>>>>
>>>>> Please help.
>>>>>
>>>>> Regards,
>>>>> Seraph
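P.S. For anyone following the thread: the 3 Gig / 4.5 Gig heap figures above are what we give each region server in conf/hbase-env.sh. A minimal sketch of that setting, assuming the standard HBASE_HEAPSIZE variable (the value is in MB; 4608 here is simply my 4.5 Gigs expressed in MB, not a recommendation):

  # conf/hbase-env.sh - maximum heap, in MB, given to each HBase daemon
  export HBASE_HEAPSIZE=4608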
