The LZO jar installed is:

hadoop-lzo-0.4.6.jar

The native LZO libraries are from EPEL (I think), installed on CentOS 5.5 64-bit:

[had...@ets-lax-prod-hadoop-02 Linux-amd64-64]$ yum info lzo-devel
Name       : lzo-devel
Arch       : x86_64
Version    : 2.02
Release    : 2.el5.1
Size       : 144 k
Repo       : installed
Summary    : Development files for the lzo library
URL        : http://www.oberhumer.com/opensource/lzo/
License    : GPL
Description: LZO is a portable lossless data compression library written in ANSI C.
           : It offers pretty fast compression and very fast decompression.
           : This package contains development files needed for lzo.

Is the direct buffer used only with LZO, or is it always involved with HBase 
read/writes?
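
(For reference, my understanding of "direct buffer" here: it's memory requested via ByteBuffer.allocateDirect, which lives outside the regular Java heap and so doesn't count against -Xmx. A minimal sketch to check the distinction; the class name is just illustrative:)

```java
import java.nio.ByteBuffer;

public class DirectBufferCheck {
    public static void main(String[] args) {
        // A direct buffer is allocated outside the JVM heap; it counts
        // against -XX:MaxDirectMemorySize rather than the -Xmx heap cap.
        ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024);
        // A regular buffer is a plain byte[] on the heap.
        ByteBuffer onHeap = ByteBuffer.allocate(64 * 1024);
        System.out.println(direct.isDirect()); // true
        System.out.println(onHeap.isDirect()); // false
    }
}
```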

Thanks for the help,
Sandy


-----Original Message-----
From: Ryan Rawson [mailto:[email protected]] 
Sent: Thursday, December 16, 2010 15:50
To: [email protected]
Cc: Cosmin Lehene
Subject: Re: Simple OOM crash?

What LZO version are you using?  You aren't running out of regular heap; you 
are running out of "Direct buffer memory", which is capped to prevent mishaps.  
There is a flag to increase that size:

-XX:MaxDirectMemorySize=100m

etc
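
For a regionserver you'd normally set this in hbase-env.sh via the HBASE_OPTS hook; a sketch (the 256m value is illustrative, not a recommendation):

```shell
# hbase-env.sh: raise the cap on off-heap (direct) buffer memory.
# Without this flag, the JVM defaults the direct-memory cap to roughly
# the heap size; the LZO compressor allocates its buffers from this pool.
export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=256m"
```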

enjoy,
-ryan

On Thu, Dec 16, 2010 at 3:07 PM, Sandy Pratt <[email protected]> wrote:
> Hello HBasers,
>
> I had a regionserver crash recently, and in perusing the logs it looks like 
> it simply had a bit too little memory.  I'm running with 2200 MB heap on 
> each regionserver.  I plan to shave a bit off the child VM allowance in 
> favor of the regionserver to correct this, probably bringing it up to 2500 
> MB.  My question is if there is any more specific memory allocation I should 
> make rather than simply giving more to the RS.  I wonder about this because 
> of the following:
>
> load=(requests=0, regions=709, usedHeap=1349, maxHeap=2198)
>
> which suggests to me that there was heap available, but the RS couldn't use 
> it for some reason.
>
> Conjecture: I do run with LZO compression, so I wonder if I could be hitting 
> that memory leak referenced earlier on the list.  I know there's a new 
> version of the LZO library available that I should upgrade to, but is it also 
> possible to simply alter the table to gzip compression and do a major 
> compaction, then uninstall LZO once that completes?
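
(A sketch of that alter-to-gzip idea in the HBase shell, assuming the 0.20-era syntax where the table must be disabled before altering; the table and family names are taken from the log below:)

```
disable 'ets.events'
alter 'ets.events', {NAME => 'f1', COMPRESSION => 'GZ'}
enable 'ets.events'
major_compact 'ets.events'
```

Note that existing LZO-compressed HFiles stay readable until the major compaction rewrites them, so the LZO codec could only be uninstalled once compaction has completed on every region.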
>
> Log follows:
>
> 2010-12-15 20:01:05,239 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region ets.events,36345112f5654a29b308014f89c108e6,1279815820311.1063152548
> 2010-12-15 20:01:05,239 DEBUG org.apache.hadoop.hbase.regionserver.Store: Major compaction triggered on store f1; time since last major compaction 119928149ms
> 2010-12-15 20:01:05,240 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 2 file(s) in f1 of ets.events,36345112f5654a29b308014f89c108e6,1279815820311.1063152548 into hdfs://ets-lax-prod-hadoop-01.corp.adobe.com:54310/hbase/ets.events/1063152548/.tmp, sequenceid=25718885315
> 2010-12-15 20:01:19,403 WARN org.apache.hadoop.hbase.regionserver.Store: Not in set org.apache.hadoop.hbase.regionserver.storescan...@7466c84
> 2010-12-15 20:01:19,572 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Aborting region server serverName=ets-lax-prod-hadoop-02.corp.adobe.com,60020,1289682554219, load=(requests=0, regions=709, usedHeap=1349, maxHeap=2198): Uncaught exception in service thread regionserver60020.compactor
> java.lang.OutOfMemoryError: Direct buffer memory
>        at java.nio.Bits.reserveMemory(Bits.java:656)
>        at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:113)
>        at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:305)
>        at com.hadoop.compression.lzo.LzoCompressor.init(LzoCompressor.java:223)
>        at com.hadoop.compression.lzo.LzoCompressor.reinit(LzoCompressor.java:207)
>        at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:105)
>        at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
>        at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:198)
>        at org.apache.hadoop.hbase.io.hfile.HFile$Writer.getCompressingStream(HFile.java:391)
>        at org.apache.hadoop.hbase.io.hfile.HFile$Writer.newBlock(HFile.java:377)
>        at org.apache.hadoop.hbase.io.hfile.HFile$Writer.checkBlockBoundary(HFile.java:348)
>        at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:530)
>        at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:495)
>        at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:817)
>        at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:811)
>        at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:670)
>        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:722)
>        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:671)
>        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:84)
> 2010-12-15 20:01:19,586 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=709, stores=709, storefiles=731, storefileIndexSize=418, memstoreSize=33, compactionQueueSize=15, usedHeap=856, maxHeap=2198, blockCacheSize=366779472, blockCacheFree=87883088, blockCacheCount=5494, blockCacheHitRatio=0
> 2010-12-15 20:01:20,571 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
>
> Thanks,
>
> Sandy
>
>
