Friso:

Did you cc Kevin?  He might have an idea.

Good on you,
St.Ack

On Mon, Jan 3, 2011 at 8:15 AM, Friso van Vollenhoven
<[email protected]> wrote:
> Hi all,
>
> I seem to be running into a problem that occurs when using LZO compression
> under a heavy write-only load. I am using 0.90 RC1 and, thus, the LZO
> compressor code that supports the reinit() method (from Kevin Weil's github,
> version 0.4.8). There are several Hadoop LZO incarnations around, so I am
> pointing my question at this list.
>
> It looks like the compressor uses direct byte buffers to store the original
> and compressed bytes in memory, so the native code can work with them without
> the JVM having to copy anything around. The direct buffers can be reused
> after a reinit() call, but will often be newly created in the init() method,
> because the existing buffer may be the wrong size for reuse. In the latter
> case the buffers previously used by the compressor instance become eligible
> for garbage collection. I think the problem is that this collection never
> occurs (in time), because the GC does not consider it necessary yet. The GC
> does not know about the native heap, and based on the state of the JVM heap
> there is no reason to finalize these objects yet. However, direct byte
> buffers only release their native memory in the finalizer, so the native
> heap keeps growing. Under a write-only load a full GC will rarely happen,
> because the heap will not grow far beyond the memstores (no block cache is
> used). So what happens is that the machine starts using swap before the GC
> ever cleans up the direct byte buffers. I am guessing that without the
> reinit() support the buffers were collected earlier, because the referring
> objects would also be collected every now and then, or perhaps things would
> just never get promoted to an older generation.
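>
> To illustrate the pattern I mean, here is a minimal, made-up sketch (not the
> actual hadoop-lzo code; the class and field names are hypothetical):
>
>   import java.nio.ByteBuffer;
>
>   public class DirectBufferSketch {
>     private ByteBuffer uncompressedDirectBuf;
>     private ByteBuffer compressedDirectBuf;
>
>     // Called from reinit(), possibly with a different buffer size.
>     void init(int directBufferSize) {
>       if (uncompressedDirectBuf == null
>           || uncompressedDirectBuf.capacity() != directBufferSize) {
>         // The old buffers become unreachable here, but their native memory
>         // is only released when the GC eventually runs their finalizer --
>         // which a write-only load with a mostly idle JVM heap may not
>         // trigger for a long time.
>         uncompressedDirectBuf = ByteBuffer.allocateDirect(directBufferSize);
>         compressedDirectBuf = ByteBuffer.allocateDirect(directBufferSize);
>       } else {
>         uncompressedDirectBuf.clear();
>         compressedDirectBuf.clear();
>       }
>     }
>   }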
>
> When I do a pmap on a running RS after it has grown to some 40GB resident
> size (with a 16GB heap), it shows a lot of near-64MB anon blocks (presumably
> native heap). I saw this before with the 0.4.6 version of Hadoop LZO, but
> that was under normal load. After that I went back to an HBase version that
> does not require the reinit(). Now I am on 0.90 with the new LZO, but I never
> ran a heavy load like this one with it, until now...
>
> Can anyone with a better understanding of the LZO code confirm that the
> above could be the case? If so, would it be possible to change the LZO
> compressor (and decompressor) to use just one fixed-size buffer (they all
> appear to be near 64MB anyway), or to reuse an existing buffer even when it
> is not the exact required size but just large enough to make do? Having
> short-lived direct byte buffers is apparently a discouraged practice. If
> anyone can provide some pointers on what to look out for, I could invest
> some time in creating a patch.
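>
> As a rough idea of the reuse I have in mind (again just a sketch with
> made-up names, not a patch against the actual code):
>
>   import java.nio.ByteBuffer;
>
>   public class BufferReuseSketch {
>     // Reuse the existing direct buffer whenever its capacity is large
>     // enough, instead of requiring an exact size match.
>     static ByteBuffer ensureCapacity(ByteBuffer buf, int requiredSize) {
>       if (buf == null || buf.capacity() < requiredSize) {
>         buf = ByteBuffer.allocateDirect(requiredSize);
>       }
>       buf.clear();
>       buf.limit(requiredSize);
>       return buf;
>     }
>   }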
>
>
> Thanks,
> Friso
>
>
