Hi Todd,

Can you please share the URL for this fix?

Thanks,
Sean

On Sat, Nov 13, 2010 at 9:10 PM, Todd Lipcon <[email protected]> wrote:

> Hi Friso,
>
> I think I identified the issue. As you suspected, we were unnecessarily
> allocating a lot of native byte buffers in the LZO code where we weren't
> before.
>
> I just pushed a fix to my LZO repository and bumped the version number to
> 0.4.7.
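>
> To give a feel for the kind of pattern involved (this is a hypothetical
> sketch of the leak pattern, not the actual hadoop-lzo change): allocating a
> fresh direct ByteBuffer on every (de)compressor setup leaks native memory
> quickly, while reusing a buffer per codec instance keeps the native
> footprint flat.
>
>     import java.nio.ByteBuffer;
>
>     // Hypothetical illustration only.
>     class ReusableDirectBuffer {
>       private ByteBuffer buf;
>
>       // Reuse the existing native buffer when it is big enough;
>       // only reallocate when the requested size grows.
>       ByteBuffer get(int size) {
>         if (buf == null || buf.capacity() < size) {
>           buf = ByteBuffer.allocateDirect(size);
>         }
>         buf.clear();
>         return buf;
>       }
>     }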
>
> If you have a chance to test this in a dev environment, that would be great.
> I will try to test it myself this week. (Unfortunately, I haven't been able
> to reproduce the issue yet.)
>
> Thanks
> -Todd
>
> On Fri, Nov 12, 2010 at 4:09 PM, Todd Lipcon <[email protected]> wrote:
>
> > Hey Friso,
> >
> > Thanks so much for the details. I am starting to imagine it could indeed
> > be a codec leak - especially since you have some cells which are into the
> > MBs, maybe it's expanding some buffers to 64MB.
> >
> > Let me try to do some tests to reproduce it here in the next week or so.
> >
> > Anyone else seen this issue?
> >
> > Thanks
> > -Todd
> >
> > On Fri, Nov 12, 2010 at 1:19 AM, Friso van Vollenhoven <
> > [email protected]> wrote:
> >
> >> Hi Todd,
> >>
> >> I am afraid I no longer have the broken setup around, because we really
> >> need a working one right now. We need to demo at a conference next week,
> >> and until after that all changes are frozen both on dev and prod (so we
> >> can use dev as a fallback). Later on I could maybe try some more things
> >> on our dev boxes.
> >>
> >> If you are doing a repro, here's the stuff you'd probably want to know:
> >> The workload is write only. No reads happening at the same time. No other
> >> active clients. It is an initial import of data. We do insertions in a MR
> >> job from the reducers. The total volume is about 11 billion puts across
> >> roughly 450K rows per table (we have a many-columns-per-row data model)
> >> across 15 tables, all using LZO. Qualifiers are some 50 bytes. Values
> >> generally range from a small number of KBs to MBs in rare cases. The row
> >> keys have a time-related part at the start, so I know the keyspace in
> >> advance, and I create the empty tables with pre-created regions (40
> >> regions) across the keyspace to get decent distribution from the start of
> >> the job. In order not to overload HBase, I run the job with only 15
> >> reducers, so there are at most 15 concurrent clients active.
> >>
> >> Other settings: max file size is 1GB, HFile block size is the default
> >> 64K, client-side write buffer is 16M, memstore flush size is 128M,
> >> compaction threshold is 5, blocking store files is 9, memstore upper
> >> limit is 20%, lower limit 15%, block cache 40%.
> >>
> >> During the run, the RSes never report more than 5GB of heap usage in the
> >> UI, which makes sense, because the block cache is not touched. On a
> >> healthy run with somewhat conservative settings right now, HBase reports
> >> on average about 380K requests per second in the master UI.
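> >>
> >> For reference, the pre-split table creation looks roughly like this. This
> >> is only a sketch: the exact HBaseAdmin/HColumnDescriptor signatures vary
> >> a bit between HBase versions, the table and family names are placeholders,
> >> and the split-key scheme here is simplified compared to our time-based
> >> keys.
> >>
> >>     import org.apache.hadoop.conf.Configuration;
> >>     import org.apache.hadoop.hbase.HBaseConfiguration;
> >>     import org.apache.hadoop.hbase.HColumnDescriptor;
> >>     import org.apache.hadoop.hbase.HTableDescriptor;
> >>     import org.apache.hadoop.hbase.client.HBaseAdmin;
> >>
> >>     public class CreatePreSplitTable {
> >>       public static void main(String[] args) throws Exception {
> >>         Configuration conf = HBaseConfiguration.create();
> >>         HBaseAdmin admin = new HBaseAdmin(conf);
> >>
> >>         HTableDescriptor desc = new HTableDescriptor("mytable");
> >>         HColumnDescriptor family = new HColumnDescriptor("d");
> >>         // LZO compression on the column family (requires the hadoop-lzo
> >>         // native libs to be installed on all region servers).
> >>         family.setCompressionType(
> >>             org.apache.hadoop.hbase.io.hfile.Compression.Algorithm.LZO);
> >>         desc.addFamily(family);
> >>
> >>         // 39 split keys give 40 regions. Because the row keys start with
> >>         // a known time-related part, the split points can be computed up
> >>         // front; here we just spread single-byte prefixes evenly.
> >>         byte[][] splits = new byte[39][];
> >>         for (int i = 0; i < splits.length; i++) {
> >>           splits[i] = new byte[] { (byte) (((i + 1) * 256) / 40) };
> >>         }
> >>         admin.createTable(desc, splits);
> >>       }
> >>     }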
> >>
> >> The cluster has 8 workers running TT, DN, RS and another JVM process for
> >> our own software that sits in front of HBase. Workers are dual quad cores
> >> with 64GB RAM and 10x 600GB disks (we decided to scale the number of
> >> seeks we can do concurrently). Disks are quite fast: 10K RPM. MR task VMs
> >> get 1GB of heap each, as do the TT and DN. The RS gets 16GB of heap, and
> >> our own software does too. We run 8 mappers and 4 reducers per node. So
> >> at the absolute max we should have 46GB of allocated heap, which leaves
> >> 18GB for JVM overhead, native allocations and the OS. We run Linux
> >> 2.6.18-194.11.4.el5. I think it is CentOS, but I didn't do the installs
> >> myself.
> >>
> >> I tried numerous different settings, both more extreme and more
> >> conservative, to get the thing working, but in the end it always ends up
> >> swapping. I should have tried a run without LZO, of course, but I was out
> >> of time by then.
> >>
> >>
> >>
> >> Cheers,
> >> Friso
> >>
> >>
> >>
> >> On 12 nov 2010, at 07:06, Todd Lipcon wrote:
> >>
> >> > Hrm, any chance you can run with a smaller heap and get a jmap dump?
> >> > The Eclipse MAT tool is also super nice for looking at this stuff, if
> >> > indeed they are Java objects.
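> >> >
> >> > Something along these lines should work (adjust the pid and output
> >> > path as needed):
> >> >
> >> >     # binary heap dump of the running region server process
> >> >     jmap -dump:format=b,file=/tmp/rs-heap.hprof <RS_PID>
> >> >
> >> > The resulting .hprof file can then be opened in Eclipse MAT.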
> >> >
> >> > What kind of workload are you using? Read mostly? Write mostly? Mixed?
> >> > I will try to repro.
> >> >
> >> > -Todd
> >> >
> >> > On Thu, Nov 11, 2010 at 8:41 PM, Friso van Vollenhoven <
> >> > [email protected]> wrote:
> >> >
> >> >> I figured the same. I also did a run with CMS instead of G1. Same
> >> >> results.
> >> >>
> >> >> I also did a run with the RS heap tuned down to 12GB and 8GB, but
> >> >> given enough time the process still grows to over 40GB in size.
> >> >>
> >> >>
> >> >> Friso
> >> >>
> >> >>
> >> >>
> >> >> On 12 nov 2010, at 01:55, Todd Lipcon wrote:
> >> >>
> >> >>> Can you try running this with CMS GC instead of G1GC? G1 still has
> >> >>> some bugs... 64M sounds like it might be G1 "regions"?
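> >> >>>
> >> >>> Roughly, that would mean dropping -XX:+UnlockExperimentalVMOptions
> >> >>> and -XX:+UseG1GC from your command line and using something like the
> >> >>> following instead (the tuning values are just a starting point):
> >> >>>
> >> >>>     -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
> >> >>>     -XX:CMSInitiatingOccupancyFraction=70 \
> >> >>>     -XX:+UseCMSInitiatingOccupancyOnly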
> >> >>>
> >> >>> -Todd
> >> >>>
> >> >>> On Thu, Nov 11, 2010 at 2:07 AM, Friso van Vollenhoven <
> >> >>> [email protected]> wrote:
> >> >>>
> >> >>>> Hi All,
> >> >>>>
> >> >>>> (This is all about CDH3, so I am not sure whether it should go on
> >> >>>> this list, but I figure it is at least interesting for people trying
> >> >>>> the same.)
> >> >>>>
> >> >>>> I've recently tried CDH3 on a new cluster from RPMs with the
> >> >>>> hadoop-lzo fork from https://github.com/toddlipcon/hadoop-lzo.
> >> >>>> Everything works like a charm initially, but after some time
> >> >>>> (minutes to at most one hour), the RS JVM process memory grows to
> >> >>>> more than twice the given heap size and beyond. I have seen an RS
> >> >>>> with a 16GB heap grow to 55GB virtual size. At some point everything
> >> >>>> starts swapping, GC times go into the minutes, and everything dies
> >> >>>> or is considered dead by the master.
> >> >>>>
> >> >>>> I did a pmap -x on the RS process and that shows a lot of allocated
> >> >>>> blocks of about 64M each. There are about 500 of these, which is
> >> >>>> 32GB in total. See: http://pastebin.com/8pgzPf7b (bottom of the
> >> >>>> file; the blocks of about 1M at the top are probably thread stacks).
> >> >>>> Unfortunately, Linux shows the native heap as anon blocks, so I
> >> >>>> cannot link it to a specific lib or anything.
> >> >>>>
> >> >>>> I am running the latest CDH3 and hadoop-lzo 0.4.6 (from said URL,
> >> >>>> the one which has the reinit() support). I run Java 6u21 with the G1
> >> >>>> garbage collector, which has been running fine for some weeks now.
> >> >>>> The full command line is:
> >> >>>> java -Xmx16000m -XX:+HeapDumpOnOutOfMemoryError
> >> >>>> -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:+UseCompressedOops
> >> >>>> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> >> >>>> -Xloggc:/export/logs/hbase/gc-hbase.log
> >> >>>> -Djava.library.path=/home/inr/java-lib/hbase/native/Linux-amd64-64
> >> >>>> -Djava.net.preferIPv4Stack=true -Dhbase.log.dir=/export/logs/hbase
> >> >>>> -Dhbase.log.file=hbase-hbase-regionserver-w3r1.inrdb.ripe.net.log
> >> >>>> -Dhbase.home.dir=/usr/lib/hbase/bin/.. -Dhbase.id.str=hbase
> >> >>>> -Dhbase.r
> >> >>>>
> >> >>>> I searched the HBase source for something that could point to
> >> >>>> native heap usage (like ByteBuffer#allocateDirect(...)), but I could
> >> >>>> not find anything. Thread count is about 185 (I have 100 handlers),
> >> >>>> so nothing strange there either.
> >> >>>>
> >> >>>> The question is: could this be HBase, or is this a problem with
> >> >>>> hadoop-lzo?
> >> >>>>
> >> >>>> I have currently downgraded to a version known to work, because we
> >> >>>> have a demo coming up, but I am still interested in the answer.
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> Regards,
> >> >>>> Friso
> >> >>>>
> >> >>>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Todd Lipcon
> >> >>> Software Engineer, Cloudera
> >> >>
> >> >>
> >> >
> >> >
> >> > --
> >> > Todd Lipcon
> >> > Software Engineer, Cloudera
> >>
> >>
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
--Sean
