Hey everyone,

Just wanted to let you know that I will be looking into this this coming week - we've marked it as an important thing to investigate prior to our next beta release.
Thanks
-Todd

On Sat, Jan 8, 2011 at 4:59 AM, Tatsuya Kawano <[email protected]> wrote:
>
> Hi Friso,
>
> So you found HBase 0.89 on CDH3b2 doesn't have the problem. I wonder what would happen if you replace hadoop-core-*.jar in CDH3b3 with the one contained in the HBase 0.90RC distribution (hadoop-core-0.20-append-r1056497.jar) and then rebuild hadoop-lzo against it.
>
> Here is the comment on the LzoCompressor#reinit() method:
>
> -----------------------------------
> // ... this method isn't in vanilla 0.20.2, but is in CDH3b3 and YDH
> public void reinit(Configuration conf) {
> -----------------------------------
>
> https://github.com/kevinweil/hadoop-lzo/blob/6cbf4e232d7972c94107600567333a372ea08c0a/src/java/com/hadoop/compression/lzo/LzoCompressor.java#L196
>
> I don't know if hadoop-core-0.20-append-r1056497.jar is a vanilla 0.20.2 or more like CDH3b3. Maybe I'm wrong, but if it doesn't call reinit(), you'll have a good chance of getting a stable HBase 0.90.
>
> Good luck!
>
> Tatsuya
>
> --
> Tatsuya Kawano (Mr.)
> Tokyo, Japan
>
> http://twitter.com/#!/tatsuya6502
>
>
> On 01/08/2011, at 6:33 PM, Friso van Vollenhoven wrote:
>
> > Hey Ryan,
> >
> > I went back to the older version. The problem is that going to HBase 0.90 requires an API change on the compressor side, which forces you to a version newer than 0.4.6 or so. So I also had to go back to HBase 0.89, which is again not compatible with CDH3b3, so I am back on CDH3b2 again. HBase 0.89 is stable for us, so this is not at all a problem. But this LZO problem is really in the way of our projected upgrade path (my client would like to end up with CDH3 for everything in the end, because of the support options available in case things go wrong and the Cloudera administration courses available when new ops people are hired).
> >
> > Cheers,
> > Friso
> >
> >
> > On 7 jan 2011, at 22:28, Ryan Rawson wrote:
> >
> >> Hey,
> >>
> >> Here at SU we continue to use version 0.1.0 of hadoop-gpl-compression. I know some of the newer versions had bugs which leaked DirectByteBuffer space, which might be what you are running into.
> >>
> >> Give the older version a shot; there really hasn't been much change in the way LZO works in a while, and most of the 'extra' stuff added was to support features HBase does not use.
> >>
> >> Good luck!
> >>
> >> -ryan
> >>
> >> ps: http://code.google.com/p/hadoop-gpl-compression/downloads/list
> >>
> >>
> >> On Wed, Jan 5, 2011 at 10:26 PM, Friso van Vollenhoven <[email protected]> wrote:
> >>> Thanks Sandy.
> >>>
> >>> Does setting -XX:MaxDirectMemorySize help in triggering GC when you're reaching that limit? Or does it just OOME before the actual RAM is exhausted (then you prevent swapping, which is nicer, though)?
> >>>
> >>> I guess LZO is not a solution that fits all, but we do a lot of random reads and latency can be an issue for us, so I suppose we have to stick with it.
> >>>
> >>>
> >>> Friso
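
To illustrate Friso's question above about -XX:MaxDirectMemorySize: on the Sun/OpenJDK 6 JVMs of this era, java.nio.Bits.reserveMemory reacts to hitting the configured limit by triggering a System.gc() (plus a short sleep) and only throws the "Direct buffer memory" OutOfMemoryError if that still does not free enough reserved space, so the flag does act as a best-effort GC trigger for unreachable direct buffers. It can still OOME when the buffers are still referenced or cleanup lags, as the stack trace further down in this thread shows. The following stand-alone sketch is hypothetical demo code (not HBase or hadoop-lzo code; the class name and sizes are made up) that mimics a codec replacing ~64 MB direct buffers over and over without the Java heap ever filling up:

-----------------------------------
// Hypothetical demo; run it with and without a cap, for example:
//   java -Xmx8g DirectBufferChurn
//   java -Xmx8g -XX:MaxDirectMemorySize=256m DirectBufferChurn
import java.nio.ByteBuffer;

public class DirectBufferChurn {
    public static void main(String[] args) {
        long allocatedMb = 0;
        while (true) {
            // Allocate a ~64 MB direct buffer and immediately drop the reference,
            // like a compressor re-creating its buffers on every size change.
            ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024 * 1024);
            buf.put(0, (byte) 1); // touch it so the native memory is really committed
            buf = null;
            allocatedMb += 64;
            if (allocatedMb % 1024 == 0) {
                System.out.println(allocatedMb + " MB of direct buffers allocated so far");
            }
        }
    }
}
-----------------------------------

The interesting part is what happens at the limit: as long as the previously allocated buffers are unreachable, the allocation does not fail immediately, because the JVM forces a collection first. Without any GC pressure on the Java heap, nothing else ever forces that collection, which is the behaviour Friso describes further down.
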
> >>>
> >>> On 5 jan 2011, at 20:36, Sandy Pratt wrote:
> >>>
> >>>> I was in a similar situation recently, with similar symptoms, and I experienced a crash very similar to yours. I don't have the specifics handy at the moment, but I did post to this list about it a few weeks ago. My workload is fairly write-heavy. I write about 10-20 million smallish protobuf/xml blobs per day to an HBase cluster of 12 very underpowered machines.
> >>>>
> >>>> The suggestions I received were two: 1) update to the latest hadoop-lzo and 2) specify a max direct memory size to the JVM (e.g. -XX:MaxDirectMemorySize=256m).
> >>>>
> >>>> I took a third route - change my tables back to gz compression for the time being while I figure out what to do. Since then, my memory usage has been rock steady, but more importantly my tables are roughly half the size on disk that they were with LZO, and there has been no noticeable drop in performance (but remember this is a write-heavy workload; I'm not trying to serve an online workload with low latency or anything like that). At this point, I might not return to LZO.
> >>>>
> >>>> In general, I'm not convinced that "use LZO" is universally good advice for all HBase users. For one thing, I think it assumes that all installations are focused on low latency, which is not always the case (sometimes merely good latency is enough and great latency is not needed). Secondly, it assumes some things about where the performance bottleneck lives. For example, LZO performs well in micro-benchmarks, but if you find yourself in an IO-bound batch processing situation, you might be better served by a higher compression ratio, even if it's more computationally expensive.
> >>>>
> >>>> Sandy
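
For anyone who wants to try the same route Sandy took, switching an existing column family from LZO to gzip is an admin operation plus a major compaction to rewrite the store files. A rough sketch using the 0.90-era HBase shell follows; the table and family names are made up, so double-check the syntax against your release (0.90 requires the table to be disabled for the alter):

-----------------------------------
disable 'mytable'
alter 'mytable', {NAME => 'mycf', COMPRESSION => 'GZ'}
enable 'mytable'
# existing HFiles keep their old compression until they are rewritten:
major_compact 'mytable'
-----------------------------------
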
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Friso van Vollenhoven [mailto:[email protected]]
> >>>>> Sent: Tuesday, January 04, 2011 08:00
> >>>>> To: <[email protected]>
> >>>>> Subject: Re: problem with LZO compressor on write only loads
> >>>>>
> >>>>> I ran the job again, but with fewer other processes running on the same machine, so with more physical memory available to HBase. This was to see whether there was a point where it would stop allocating more buffers. When I do this, after many hours, one of the RSes crashed with an OOME. See here:
> >>>>>
> >>>>> 2011-01-04 11:32:01,332 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=w5r1.inrdb.ripe.net,60020,1294091507228, load=(requests=6246, regions=258, usedHeap=1790, maxHeap=16000): Uncaught exception in service thread regionserver60020.compactor
> >>>>> java.lang.OutOfMemoryError: Direct buffer memory
> >>>>>     at java.nio.Bits.reserveMemory(Bits.java:633)
> >>>>>     at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:98)
> >>>>>     at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
> >>>>>     at com.hadoop.compression.lzo.LzoCompressor.init(LzoCompressor.java:248)
> >>>>>     at com.hadoop.compression.lzo.LzoCompressor.reinit(LzoCompressor.java:207)
> >>>>>     at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:105)
> >>>>>     at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
> >>>>>     at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:200)
> >>>>>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.getCompressingStream(HFile.java:397)
> >>>>>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.newBlock(HFile.java:383)
> >>>>>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.checkBlockBoundary(HFile.java:354)
> >>>>>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:536)
> >>>>>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:501)
> >>>>>     at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:836)
> >>>>>     at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:931)
> >>>>>     at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:732)
> >>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:764)
> >>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:709)
> >>>>>     at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:81)
> >>>>> 2011-01-04 11:32:01,369 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=258, stores=516, storefiles=186, storefileIndexSize=179, memstoreSize=2125, compactionQueueSize=2, usedHeap=1797, maxHeap=16000, blockCacheSize=55051488, blockCacheFree=6655834912, blockCacheCount=0, blockCacheHitCount=0, blockCacheMissCount=2397107, blockCacheEvictedCount=0, blockCacheHitRatio=0, blockCacheHitCachingRatio=0
> >>>>>
> >>>>> I am guessing the OS won't allocate any more memory to the process. As you can see, the used heap is nowhere near the max heap.
> >>>>>
> >>>>> Also, this seems to happen during compaction. I had not considered compactions a suspect yet. I could try running with a larger compaction threshold and more blocking store files. Since this is a write-only load, it should not be a problem. In our normal operation, compactions and splits are quite common, though, because we do read-modify-write cycles a lot. Anyone else doing update-heavy work with LZO?
> >>>>>
> >>>>>
> >>>>> Cheers,
> >>>>> Friso
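
The knobs Friso mentions for postponing compactions on a write-only load are ordinary hbase-site.xml settings. A hypothetical fragment with the 0.90-era property names (the values here are only examples, not recommendations):

-----------------------------------
<!-- only start a minor compaction once a store has this many files (default 3) -->
<property>
  <name>hbase.hstore.compactionThreshold</name>
  <value>6</value>
</property>
<!-- only block updates to a region beyond this many store files (default 7) -->
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>20</value>
</property>
-----------------------------------
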
> >>>>>
> >>>>> On 4 jan 2011, at 01:54, Todd Lipcon wrote:
> >>>>>
> >>>>>> Fishy. Are your cells particularly large? Or have you tuned the HFile block size at all?
> >>>>>>
> >>>>>> -Todd
> >>>>>>
> >>>>>> On Mon, Jan 3, 2011 at 2:15 PM, Friso van Vollenhoven <[email protected]> wrote:
> >>>>>>
> >>>>>>> I tried it, but it doesn't seem to help. The RS processes grow to 30Gb in minutes after the job started.
> >>>>>>>
> >>>>>>> Any ideas?
> >>>>>>>
> >>>>>>> Friso
> >>>>>>>
> >>>>>>>
> >>>>>>> On 3 jan 2011, at 19:18, Todd Lipcon wrote:
> >>>>>>>
> >>>>>>>> Hi Friso,
> >>>>>>>>
> >>>>>>>> Which OS are you running? Particularly, which version of glibc?
> >>>>>>>>
> >>>>>>>> Can you try running with the environment variable MALLOC_ARENA_MAX=1 set?
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> -Todd
> >>>>>>>>
> >>>>>>>> On Mon, Jan 3, 2011 at 8:15 AM, Friso van Vollenhoven <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi all,
> >>>>>>>>>
> >>>>>>>>> I seem to run into a problem that occurs when using LZO compression on a heavy write-only load. I am using 0.90 RC1 and, thus, the LZO compressor code that supports the reinit() method (from Kevin Weil's github, version 0.4.8). There are some more Hadoop LZO incarnations, so I am pointing my question to this list.
> >>>>>>>>>
> >>>>>>>>> It looks like the compressor uses direct byte buffers to store the original and compressed bytes in memory, so the native code can work with them without the JVM having to copy anything around. The direct buffers are possibly reused after a reinit() call, but will often be newly created in the init() method, because the existing buffer can be the wrong size for reuse. The latter case leaves the buffers previously used by the compressor instance eligible for garbage collection. I think the problem is that this collection never occurs (in time), because the GC does not consider it necessary yet. The GC does not know about the native heap, and based on the state of the JVM heap, there is no reason to finalize these objects yet. However, direct byte buffers are only freed in the finalizer, so the native heap keeps growing. On write-only loads, a full GC will rarely happen, because the used heap will not grow far beyond the memstores (no block cache is used). So what happens is that the machine starts using swap before the GC will ever clean up the direct byte buffers. I am guessing that without the reinit() support, the buffers were collected earlier because the referring objects would also be collected every now and then, or things would perhaps just never promote to an older generation.
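
For what it's worth, the usual (if ugly) workaround for exactly the situation described above, short-lived direct buffers whose native memory is only reclaimed once the GC happens to notice them, is to release that memory eagerly through the buffer's Cleaner. This is a sketch of that idea using Sun/OpenJDK 6/7 internals (sun.nio.ch.DirectBuffer and sun.misc.Cleaner); it is not something hadoop-lzo does, and the class and method names below are made up:

-----------------------------------
import java.nio.ByteBuffer;

// Hypothetical helper, not hadoop-lzo code. Relies on Sun/OpenJDK 6/7 internals,
// so it is non-portable by design.
public final class DirectBuffers {
    private DirectBuffers() {}

    /** Free the native memory behind a direct buffer right away instead of
     *  waiting for the garbage collector to get around to it. */
    static void freeEagerly(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect()) {
            return;
        }
        sun.misc.Cleaner cleaner = ((sun.nio.ch.DirectBuffer) buffer).cleaner();
        if (cleaner != null) {
            cleaner.clean(); // the same deallocation that normally only runs after GC
        }
        // The caller must drop its own reference afterwards; touching the buffer
        // after clean() would access freed native memory.
    }
}
-----------------------------------
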
> >>>>>>>>>
> >>>>>>>>> When I do a pmap on a running RS after it has grown to some 40Gb resident size (with a 16Gb heap), it shows a lot of near-64M anon blocks (presumably native heap). I saw this before with the 0.4.6 version of Hadoop LZO, but that was under normal load. After that I went back to an HBase version that does not require the reinit(). Now I am on 0.90 with the new LZO, but I never did a heavy load like this one with it, until now...
> >>>>>>>>>
> >>>>>>>>> Can anyone with a better understanding of the LZO code confirm that the above could be the case? If so, would it be possible to change the LZO compressor (and decompressor) to use maybe just one fixed-size buffer (they all appear near 64M anyway) or possibly reuse an existing buffer also when it is not the exact required size but just large enough to make do? Having short-lived direct byte buffers is apparently a discouraged practice. If anyone can provide some pointers on what to look out for, I could invest some time in creating a patch.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Friso
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Todd Lipcon
> >>>>>>>> Software Engineer, Cloudera
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Todd Lipcon
> >>>>>> Software Engineer, Cloudera
> >>>>
> >>>
> >
>

--
Todd Lipcon
Software Engineer, Cloudera
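
Finally, the reuse idea Friso floats in the quoted message (keep the existing direct buffer whenever it is already large enough, instead of allocating an exact-size replacement) would look roughly like the sketch below. This is illustrative code only, not the actual hadoop-lzo implementation; the class, field, and method names are invented:

-----------------------------------
import java.nio.ByteBuffer;

public class ReusableBufferHolder {
    private ByteBuffer directBuf;

    /**
     * Return a direct buffer with room for at least requiredSize bytes,
     * reusing the previous one when possible so the compressor stops
     * churning through short-lived direct buffers.
     */
    ByteBuffer ensureCapacity(int requiredSize) {
        if (directBuf == null || directBuf.capacity() < requiredSize) {
            // Only allocate when the requirement actually outgrows what we have.
            directBuf = ByteBuffer.allocateDirect(requiredSize);
        }
        // Reset position/limit so callers see exactly the window they asked for.
        directBuf.clear();
        directBuf.limit(requiredSize);
        return directBuf;
    }
}
-----------------------------------

The trade-off is that the holder keeps its largest-ever buffer alive for its own lifetime, which is exactly what caps the native footprint per compressor instance instead of letting it churn.
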
