Hey Todd,

Hopefully I can get to this sometime next week. Our NN (NameNode) got corrupted, so we are rebuilding the prod cluster, which means dev is now backing our apps and I have no environment to give this a go. Stay tuned...
>> Yea, you're definitely on the right track. Have you considered systems
>> programming, Friso? :)

Well, at least then you get to do your own memory management most of the time...

Friso

> Can someone who is having this issue try checking out the following git
> branch and rebuilding LZO?
>
> https://github.com/toddlipcon/hadoop-lzo/tree/realloc
>
> This definitely stems one leak of a 64KB direct buffer on every reinit.
>
> -Todd
>
> On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <[email protected]> wrote:
>
>> Yea, you're definitely on the right track. Have you considered systems
>> programming, Friso? :)
>>
>> Hopefully I'll have a candidate patch to LZO later today.
>>
>> -Todd
>>
>> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven
>> <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> My guess is indeed that it has to do with using the reinit() method on
>>> compressors to make them long-lived instead of throwaway, combined with
>>> the LZO implementation of reinit(), which magically causes NIO buffer
>>> objects not to be finalized and, as a result, not to release their
>>> native allocations. It's just a theory and I haven't had the time to
>>> properly verify it (unfortunately, I spend most of my time writing
>>> application code), but Todd said he will be looking into it further. I
>>> browsed the LZO code to see what was going on there, but with my
>>> limited knowledge of the HBase code it would be bold to say that this
>>> is for sure the case. It would be my first direction of investigation.
>>> I would add some logging to the LZO code where new direct byte buffers
>>> are created, to log how often that happens and what size they are, and
>>> then redo the workload that shows the leak. Together with some
>>> profiling you should be able to see how long it takes for these to get
>>> finalized.
>>>
>>> Cheers,
>>> Friso
>>>
>>>
>>> On 12 Jan 2011, at 20:08, Stack wrote:
>>>
>>>> 2011/1/12 Friso van Vollenhoven <[email protected]>:
>>>>> No, I haven't. But the Hadoop (MapReduce) LZO compression is not the
>>>>> problem. Compressing the map output using LZO works just fine. The
>>>>> problem is HBase LZO compression. The region server process is the
>>>>> one with the memory leak...
>>>>
>>>> (Sorry for the dumb question, Friso.) But HBase is leaking because we
>>>> make use of the Compression API in a manner that produces leaks?
>>>> Thanks,
>>>> St.Ack
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
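
To make the suspected mechanism concrete, here is a minimal Java sketch of the allocation pattern described above, together with the realloc-style fix. This is illustrative only, not the actual hadoop-lzo source; the class and field names are made up.

    import java.nio.ByteBuffer;

    // Illustrative only, NOT the actual hadoop-lzo code: the point is
    // the allocation pattern, not the names.
    class CompressorBufferSketch {

        private ByteBuffer uncompressedDirectBuf;
        private ByteBuffer compressedDirectBuf;
        private int directBufferSize;

        // Suspected leak pattern: every reinit() unconditionally
        // allocates fresh direct buffers. The native memory behind the
        // abandoned buffers is only released when the GC eventually
        // collects the tiny Java-side ByteBuffer objects, so a
        // long-lived, frequently reinitialized compressor can grow
        // native memory far faster than the collector reclaims it.
        void reinitLeaky(int bufferSize) {
            directBufferSize = bufferSize;
            uncompressedDirectBuf = ByteBuffer.allocateDirect(bufferSize);
            compressedDirectBuf = ByteBuffer.allocateDirect(bufferSize);
        }

        // Realloc-style fix (the idea behind Todd's branch, as I read
        // it): keep the existing buffers across reinits and only
        // allocate when the requested size actually changes.
        void reinitReusing(int bufferSize) {
            if (uncompressedDirectBuf == null || bufferSize != directBufferSize) {
                directBufferSize = bufferSize;
                uncompressedDirectBuf = ByteBuffer.allocateDirect(bufferSize);
                compressedDirectBuf = ByteBuffer.allocateDirect(bufferSize);
            }
            uncompressedDirectBuf.clear();
            compressedDirectBuf.clear();
        }
    }

The key point is that ByteBuffer.allocateDirect() grabs memory outside the Java heap that is only returned when the small heap-side buffer object is eventually garbage collected, so the leaky variant can exhaust native memory long before the heap itself looks full.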

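Friso's logging suggestion could look roughly like the sketch below, assuming the codec's direct-buffer allocations can all be routed through one helper. DirectBufferLog and newDirectBuffer are hypothetical names for illustration, not anything that exists in the LZO code.

    import java.nio.ByteBuffer;
    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical helper: route every ByteBuffer.allocateDirect()
    // call in the codec through this method to see how often direct
    // buffers are created and how big they are.
    final class DirectBufferLog {

        private static final AtomicLong COUNT = new AtomicLong();
        private static final AtomicLong BYTES = new AtomicLong();

        static ByteBuffer newDirectBuffer(int size, String site) {
            long n = COUNT.incrementAndGet();
            long total = BYTES.addAndGet(size);
            // Plain stderr keeps the sketch dependency-free; a real
            // run would use the project's logging framework.
            System.err.printf(
                "direct buffer #%d: %d bytes at %s (running total %d bytes)%n",
                n, size, site, total);
            return ByteBuffer.allocateDirect(size);
        }
    }

Rerunning the leaking workload with this in place and comparing the running total against the region server's resident memory growth should show whether allocations are outpacing finalization.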