On Tue, Nov 16, 2010 at 11:55 AM, Friso van Vollenhoven <[email protected]> wrote:
It's the version at https://github.com/toddlipcon/hadoop-lzo that got upped to 0.4.7 because of this fix (download and build yourself). You need this version to go with CDH3b3. I don't know how this relates to the ASF release/trunk of HBase. This version is a fork of https://github.com/kevinweil/hadoop-lzo, which is what I used before (on ASF HBase and CDH3b2).

It's a "fork" in the GitHub sense, but in reality, if you look at the history, both Kevin and I contribute regularly to the project and merge each other's changes.

-Todd

On 16 nov 2010, at 19:47, Sean Bigdatafun wrote:

Hi Todd,

Can you please give the URL of this fix?

Thanks,
Sean

On Sat, Nov 13, 2010 at 9:10 PM, Todd Lipcon <[email protected]> wrote:

Hi Friso,

I think I identified the issue. As you suspected, we were unnecessarily allocating a lot of native byte buffers in the LZO code where we weren't before.

I just pushed a fix to my LZO repository and bumped the version number to 0.4.7.

If you have a chance to test this on a dev environment, that would be great. I will try to test it myself this week (unfortunately, I haven't been able to reproduce the issue yet).

Thanks,
-Todd

On Fri, Nov 12, 2010 at 4:09 PM, Todd Lipcon <[email protected]> wrote:

Hey Friso,

Thanks so much for the details. I am starting to think it could indeed be a codec leak, especially since you have some cells that run into the MBs; maybe it's expanding some buffers to 64MB.

Let me try to do some tests to reproduce it here in the next week or so.

Anyone else seen this issue?

Thanks,
-Todd

On Fri, Nov 12, 2010 at 1:19 AM, Friso van Vollenhoven <[email protected]> wrote:

Hi Todd,

I am afraid I no longer have the broken setup around, because we really need a working one right now. We need to demo at a conference next week, and until after that all changes are frozen, both on dev and prod (so we can use dev as a fallback). Later on I could maybe try some more things on our dev boxes.

If you are doing a repro, here's the stuff you'd probably want to know. The workload is write-only: no reads happening at the same time and no other active clients. It is an initial import of data. We do the insertions in an MR job, from the reducers. The total volume is about 11 billion puts across roughly 450K rows per table (we have a many-columns-per-row data model), across 15 tables, all using LZO. Qualifiers are some 50 bytes. Values generally range from a small number of KBs up to MBs in rare cases.

The row keys have a time-related part at the start, so I know the keyspace in advance, and I create the empty tables with pre-created regions (40 regions) spread across the keyspace to get a decent distribution from the start of the job. In order not to overload HBase, I run the job with only 15 reducers, so at most 15 concurrent clients are active.

Other settings: max file size is 1GB, HFile block size is the default 64K, the client-side write buffer is 16M, memstore flush size is 128M, compaction threshold is 5, blocking store files is 9, memstore upper limit is 20%, lower limit 15%, block cache 40%. During the run, the RSes never report more than 5GB of heap usage in the UI, which makes sense, because the block cache is not touched.
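As a point of reference, a minimal sketch of the pre-split-table-plus-buffered-puts setup described above might look like the following against the HBase 0.90-era client API. The table name, column family, and split-key scheme are made up for illustration; this is not the actual import code.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PreSplitImportSketch {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();

        // Pre-create the table with 40 regions by passing createTable() 39 explicit
        // split keys. Here the keys are just evenly spaced single bytes; a real
        // import would derive them from the known time-prefixed keyspace.
        byte[][] splits = new byte[39][];
        for (int i = 0; i < splits.length; i++) {
          splits[i] = new byte[] { (byte) ((i + 1) * (256 / 40)) };
        }
        HTableDescriptor desc = new HTableDescriptor("example_table");
        desc.addFamily(new HColumnDescriptor("d")); // LZO compression would be set on the family here
        new HBaseAdmin(conf).createTable(desc, splits);

        // In each reducer: buffer puts client-side (16M) instead of flushing per Put.
        HTable table = new HTable(conf, "example_table");
        table.setAutoFlush(false);
        table.setWriteBufferSize(16 * 1024 * 1024);
        Put put = new Put(Bytes.toBytes("20101116-some-row"));
        put.add(Bytes.toBytes("d"), Bytes.toBytes("some-qualifier"), Bytes.toBytes("value"));
        table.put(put);
        table.flushCommits();
        table.close();
      }
    }

With only 15 reducers each holding a buffered HTable like this, client-side memory use stays small; the unexplained growth in this thread is entirely on the regionserver side.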
On a healthy run with somewhat more conservative settings right now, HBase reports on average about 380K requests per second in the master UI.

The cluster has 8 workers, each running a TT, DN, RS and another JVM process for our own software that sits in front of HBase. Workers are dual quad-cores with 64GB RAM and 10x 600GB disks (we decided to scale the number of seeks we can do concurrently). Disks are quite fast: 10K RPM. MR task VMs get 1GB of heap each, as do the TT and DN. The RS gets 16GB of heap, and so does our own software. We run 8 mappers and 4 reducers per node. So at the absolute max we should have 46GB of allocated heap, which leaves 18GB for JVM overhead, native allocations and the OS. We run Linux 2.6.18-194.11.4.el5. I think it is CentOS, but I didn't do the installs myself.

I tried numerous different settings, both more extreme and more conservative, to get the thing working, but in the end it always ends up swapping. I should have tried a run without LZO, of course, but I was out of time by then.

Cheers,
Friso

On 12 nov 2010, at 07:06, Todd Lipcon wrote:

Hrm, any chance you can run with a smaller heap and get a jmap dump? The Eclipse MAT tool is also super nice for looking at this stuff, if these are indeed Java objects.

What kind of workload are you using? Read-mostly? Write-mostly? Mixed? I will try to repro.

-Todd

On Thu, Nov 11, 2010 at 8:41 PM, Friso van Vollenhoven <[email protected]> wrote:

I figured the same. I also did a run with CMS instead of G1. Same results.

I also did a run with the RS heap tuned down to 12GB and 8GB, but given enough time the process still grows to over 40GB in size.

Friso

On 12 nov 2010, at 01:55, Todd Lipcon wrote:

Can you try running this with the CMS GC instead of G1GC? G1 still has some bugs... 64M sounds like it might be G1 "regions"?

-Todd

On Thu, Nov 11, 2010 at 2:07 AM, Friso van Vollenhoven <[email protected]> wrote:

Hi All,

(This is all about CDH3, so I am not sure whether it should go on this list, but I figure it is at least interesting for people trying the same.)

I've recently tried CDH3 on a new cluster, installed from RPMs, with the hadoop-lzo fork from https://github.com/toddlipcon/hadoop-lzo. Everything works like a charm initially, but after some time (minutes to at most one hour), the RS JVM process memory grows to more than twice the given heap size and beyond. I have seen an RS with a 16GB heap grow to 55GB virtual size. At some point everything starts swapping, GC times go into the minutes, and everything dies or is considered dead by the master.

I did a pmap -x on the RS process, and it shows a lot of blocks of about 64M allocated by the process. There are about 500 of these, which is about 32GB in total. See: http://pastebin.com/8pgzPf7b (bottom of the file; the blocks of about 1M at the top are probably thread stacks). Unfortunately, Linux shows the native heap as anon blocks, so I cannot link them to a specific lib or anything.

I am running the latest CDH3 and hadoop-lzo 0.4.6 (from said URL, the one which has the reinit() support). I run Java 6u21 with the G1 garbage collector, which has been running fine for some weeks now. The full command line is:

java -Xmx16000m -XX:+HeapDumpOnOutOfMemoryError -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:+UseCompressedOops -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/export/logs/hbase/gc-hbase.log -Djava.library.path=/home/inr/java-lib/hbase/native/Linux-amd64-64 -Djava.net.preferIPv4Stack=true -Dhbase.log.dir=/export/logs/hbase -Dhbase.log.file=hbase-hbase-regionserver-w3r1.inrdb.ripe.net.log -Dhbase.home.dir=/usr/lib/hbase/bin/.. -Dhbase.id.str=hbase -Dhbase.r

I searched the HBase source for something that could point to native heap usage (like ByteBuffer#allocateDirect(...)), but I could not find anything. Thread count is about 185 (I have 100 handlers), so nothing strange there either.
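A side note on why these allocations are invisible to the usual heap tooling: buffers obtained from ByteBuffer.allocateDirect() (and allocations made by native code via JNI) live outside the Java heap, so they do not count against -Xmx and only show up as anonymous mappings in pmap, which is exactly the pattern described above. A tiny standalone illustration follows; it is not HBase or hadoop-lzo code, and the 64MB size is chosen only to mimic the symptom.

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // Allocates direct buffers that live in native memory, outside -Xmx.
    // Run with a small heap and a generous direct-memory cap, e.g.
    //   java -Xmx256m -XX:MaxDirectMemorySize=8g DirectBufferDemo
    // and the process virtual/resident size grows far past the heap limit;
    // pmap -x <pid> shows the memory as anonymous ~64MB regions.
    public class DirectBufferDemo {
      public static void main(String[] args) throws InterruptedException {
        List<ByteBuffer> buffers = new ArrayList<ByteBuffer>();
        for (int i = 0; i < 100; i++) {
          buffers.add(ByteBuffer.allocateDirect(64 * 1024 * 1024)); // 64MB, native
          System.out.println("allocated " + (i + 1) * 64 + "MB of direct buffers");
          Thread.sleep(500); // leave time to inspect with pmap
        }
      }
    }

(A jmap heap dump would show only the small DirectByteBuffer wrapper objects, not the native memory behind them, which is part of what makes this kind of growth hard to pin down.)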
Question is: could this be HBase, or is this a problem with hadoop-lzo?

I have currently downgraded to a version known to work, because we have a demo coming up, but I am still interested in the answer.

Regards,
Friso

--
Todd Lipcon
Software Engineer, Cloudera
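For context on the 0.4.7 fix Todd describes earlier in the thread: hundreds of ~64MB native buffers is the classic result of allocating a codec's direct buffers per instance or per call instead of reusing them. The actual change lives in the https://github.com/toddlipcon/hadoop-lzo history; the snippet below is only a generic, hypothetical sketch of the pattern, not the real hadoop-lzo code.

    import java.nio.ByteBuffer;

    // Hypothetical compressor wrapper illustrating buffer reuse. The leaky variant
    // allocates a fresh 64MB direct buffer on every call; the reuse variant
    // allocates once and recycles the same buffer, so native memory stays flat
    // no matter how often the codec is reinitialized.
    public class BufferReuseSketch {
      private static final int BUFFER_SIZE = 64 * 1024 * 1024;
      private ByteBuffer workingBuffer;

      // Leaky pattern: a new native buffer per call; the old ones linger until GC
      // eventually collects their small Java wrapper objects.
      ByteBuffer leakyBuffer() {
        return ByteBuffer.allocateDirect(BUFFER_SIZE);
      }

      // Reuse pattern: allocate lazily once, then clear and hand back the same buffer.
      ByteBuffer reusedBuffer() {
        if (workingBuffer == null) {
          workingBuffer = ByteBuffer.allocateDirect(BUFFER_SIZE);
        }
        workingBuffer.clear();
        return workingBuffer;
      }
    }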
