On Sun, Oct 30, 2011 at 5:53 PM, Vladimir Rodionov <[email protected]> wrote:
> Yes, mslab is enabled. It allocates 2MB per region by default? That would
> explain the 2.4G of heap usage (1277 regions).
> We fixed the OOME by increasing the heap to 8G. Now I know that we can
> decrease the slab size and get back to 4G.
Yep. You can decrease the slab size or disable MSLAB entirely. Or consider
having fewer, larger regions per server.

-Todd

> ________________________________________
> From: Todd Lipcon [[email protected]]
> Sent: Sunday, October 30, 2011 5:12 PM
> To: [email protected]
> Subject: Re: HBASE-4107 and OOME
>
> Hi Vladimir,
>
> Do you have MSLAB enabled? My guess is that with 1000 regions you're seeing
> a lot of memory usage from MSLAB. Can you try the patch from HBASE-3680 to
> see what the "wasted memory" from MSLABs is?
>
> -Todd
>
> On Sun, Oct 30, 2011 at 4:54 PM, Vladimir Rodionov
> <[email protected]> wrote:
>>
>> We have been observing frequent OOMEs during test loads on a small cluster
>> (15 nodes), but the number of regions is quite high (~1K per region server).
>>
>> It seems that we are constantly hitting the HBASE-4107 bug.
>>
>> 2011-10-29 07:23:19,963 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
>> 2011-10-29 07:23:23,171 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=30.07 MB, free=764.53 MB, max=794.6 MB, blocks=418, accesses=198528211, hits=196714784, hitRatio=99.08%%, cachingAccesses=196715094, cachingHits=196714676, cachingHitsRatio=99.99%%, evictions=0, evicted=0, evictedPerRun=NaN
>> 2011-10-29 07:23:55,858 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
>> 2011-10-29 07:26:43,776 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
>> java.io.IOException: Reflection
>>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:147)
>>         at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1002)
>>         at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:979)
>> Caused by: java.lang.reflect.InvocationTargetException
>>         at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:145)
>>         ... 2 more
>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$Packet.<init>(DFSClient.java:2204)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:3086)
>>         at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
>>         at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3169)
>>         at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>>         at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:944)
>>         ... 6 more
>> 2011-10-29 07:26:43,809 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
>>
>> The most interesting part is the heap dump analysis:
>>
>> Heap Size = 4G
>>
>> byte[] consumes 86% of the heap
>> 68% of the overall heap is reachable from MemStore instances
>> MemStore -> KeyValueSkipListSet -> ConcurrentSkipListMap
>>
>> I am not saying that this is a memory leak, but taking into account that
>> the MemStore default size is 40% of 4G = 1.6G, 68% looks very suspicious.
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [email protected]
>>
>> ________________________________________
>> From: Ted Yu [[email protected]]
>> Sent: Saturday, October 29, 2011 3:56 PM
>> To: [email protected]
>> Subject: test failure due to missing baseznode
>>
>> If you happen to see a test failure similar to the following:
>> https://builds.apache.org/job/PreCommit-HBASE-Build/99//testReport/org.apache.hadoop.hbase.mapreduce/TestLoadIncrementalHFilesSplitRecovery/testBulkLoadPhaseRecovery/
>>
>> Please go over https://issues.apache.org/jira/browse/HBASE-4253
>> and apply a similar fix, as in the following:
>> https://issues.apache.org/jira/secure/attachment/12491621/HBASE-4253.patch
>>
>> Cheers
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
Todd Lipcon
Software Engineer, Cloudera
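For reference, Todd's three suggestions (smaller MSLAB chunks, disabling MSLAB,
fewer but larger regions) and the "40% of heap" memstore figure quoted above all
map to region-server configuration properties. The sketch below uses the property
names from the 0.90/0.92-era releases this thread is about; the values are
illustrative only, and both the names and the defaults should be checked against
the version actually deployed. As a rough sanity check, ~1277 regions each holding
at least one 2 MB MSLAB chunk account for about 2.5 GB of heap on their own
(assuming one column family per region), which is consistent with the 2.4G figure
quoted above.

    <!-- hbase-site.xml (region-server side; a region server restart is required) -->

    <!-- Option 1: keep MSLAB but shrink the per-memstore chunk from the 2 MB default. -->
    <property>
      <name>hbase.hregion.memstore.mslab.chunksize</name>
      <value>524288</value> <!-- 512 KB; illustrative, not a recommendation -->
    </property>

    <!-- Option 2: disable MSLAB entirely, at the cost of reintroducing the old-gen
         heap fragmentation that MSLAB was added to mitigate. -->
    <property>
      <name>hbase.hregion.memstore.mslab.enabled</name>
      <value>false</value>
    </property>

    <!-- Option 3: fewer, larger regions: raise the split size so the region count
         drops over time (existing regions would still need merging or reloading). -->
    <property>
      <name>hbase.hregion.max.filesize</name>
      <value>1073741824</value> <!-- 1 GB; illustrative -->
    </property>

    <!-- The global memstore cap referenced in the heap-dump analysis
         (default 0.4, i.e. 40% of heap). -->
    <property>
      <name>hbase.regionserver.global.memstore.upperLimit</name>
      <value>0.4</value>
    </property>

Options 1 and 2 are alternatives rather than a pair: the chunk-size setting only
matters while MSLAB stays enabled, and disabling MSLAB makes it moot.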
