Hi Vladimir,

Do you have MSLAB enabled? My guess is that with 1000 regions you're
seeing a lot of memory usage from MSLAB. Can you try the patch from
HBASE-3680 to see how much "wasted memory" the MSLABs account for?
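(MSLAB is toggled by hbase.hregion.memstore.mslab.enabled in
hbase-site.xml; the chunk size is hbase.hregion.memstore.mslab.chunksize,
2MB by default.) As a back-of-the-envelope sketch of why the region count
matters, assuming one mostly-empty resident chunk per open memstore:

    // Hypothetical worst-case estimate, not HBase code: every open
    // memstore pins at least one MSLAB chunk, and for cold regions
    // that chunk is mostly empty ("wasted" memory).
    public class MslabOverhead {
        public static void main(String[] args) {
            long chunkBytes = 2L * 1024 * 1024;  // default chunk size
            int regionsPerServer = 1000;
            long pinned = chunkBytes * regionsPerServer;
            System.out.println("worst-case MSLAB overhead: "
                + (pinned / (1024 * 1024)) + " MB");  // prints 2000
        }
    }

That could pin ~2GB of a 4G heap before any real data is counted.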

-Todd

On Sun, Oct 30, 2011 at 4:54 PM, Vladimir Rodionov
<[email protected]> wrote:
>
> We have been observing frequent OOMEs under test load on a small cluster
> (15 nodes), but the number of regions is quite high (~1K per region
> server).
>
> It seems that we are constantly hitting the HBASE-4107 bug.
>
> 2011-10-29 07:23:19,963 INFO org.apache.hadoop.io.compress.CodecPool: Got 
> brand-new compressor
> 2011-10-29 07:23:23,171 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> LRU Stats: total=30.07 MB, free=764.53 MB, max=794.6 MB, blocks=418, 
> accesses=198528211, hits=196714784, hitRatio=99.08%%, 
> cachingAccesses=196715094, cachingHits=196714676, cachingHitsRatio=99.99%%, 
> evictions=0, evicted=0, evictedPerRun=NaN
> 2011-10-29 07:23:55,858 INFO org.apache.hadoop.io.compress.CodecPool: Got 
> brand-new compressor
> 2011-10-29 07:26:43,776 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: 
> Could not append. Requesting close of hlog
> java.io.IOException: Reflection
>        at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:147)
>        at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1002)
>        at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:979)
> Caused by: java.lang.reflect.InvocationTargetException
>        at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
>        at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:145)
>        ... 2 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
>        at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$Packet.<init>(DFSClient.java:2204)
>        at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:3086)
>        at 
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
>        at 
> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
>        at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3169)
>        at 
> org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>        at 
> org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:944)
>        ... 6 more
> 2011-10-29 07:26:43,809 INFO org.apache.hadoop.io.compress.CodecPool: Got 
> brand-new compressor
>
> The most interesting part is the heap dump analysis:
>
> Heap Size = 4G
>
> byte[] arrays consume 86% of the heap
> 68% of the overall heap is reachable from MemStore instances
> MemStore -> KeyValueSkipListSet -> ConcurrentSkipListMap
>
> I am not saying that this is a memory leak, but taking into account that
> the default MemStore limit is 40% of the heap (40% of 4G = 1.6G), 68%
> looks very suspicious.
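>
> For concreteness, the arithmetic (a sketch, assuming the default
> hbase.regionserver.global.memstore.upperLimit of 0.4; the 68% figure
> is what the heap dump shows):
>
>     public class MemstoreRetention {
>         public static void main(String[] args) {
>             long heap = 4L << 30;                 // 4G heap
>             long ceiling = (long) (heap * 0.40);  // ~1.6G memstore limit
>             long observed = (long) (heap * 0.68); // ~2.7G from MemStores
>             System.out.println("over the ceiling by "
>                 + ((observed - ceiling) >> 20) + " MB");  // ~1146 MB
>         }
>     }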
>
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: [email protected]
>
> ________________________________________
> From: Ted Yu [[email protected]]
> Sent: Saturday, October 29, 2011 3:56 PM
> To: [email protected]
> Subject: test failure due to missing baseznode
>
> If you happen to see a test failure similar to the following:
> https://builds.apache.org/job/PreCommit-HBASE-Build/99//testReport/org.apache.hadoop.hbase.mapreduce/TestLoadIncrementalHFilesSplitRecovery/testBulkLoadPhaseRecovery/
>
> Please go over https://issues.apache.org/jira/browse/HBASE-4253
>
> and apply a fix similar to the following:
> https://issues.apache.org/jira/secure/attachment/12491621/HBASE-4253.patch
>
> Cheers
>



-- 
Todd Lipcon
Software Engineer, Cloudera
