Nothing out of the ordinary. HFile blocks are at the default 64KB. Max file size is 1GB. Writes are done without the WAL. The client-side write buffer is larger than the default, at 16MB. The memstore flush size is 128MB. The compaction threshold and blocking store files are 5 and 9 respectively. Everything else is at the defaults.
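For reference, this is roughly what that setup looks like spelled out against the 0.90 client API. It is only a sketch: the property names are the stock 0.90 ones, the table and column family names are made up, and the region server settings would normally live in hbase-site.xml rather than in client code.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.hfile.Compression;
    import org.apache.hadoop.hbase.util.Bytes;

    public class WriteHeavySetup {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Region server side settings (these belong in hbase-site.xml on the
        // cluster; they are set here only to spell out the values).
        conf.setLong("hbase.hregion.max.filesize", 1024L * 1024 * 1024);       // 1GB max file size
        conf.setLong("hbase.hregion.memstore.flush.size", 128L * 1024 * 1024); // 128MB memstore flush
        conf.setInt("hbase.hstore.compactionThreshold", 5);                    // compaction threshold
        conf.setInt("hbase.hstore.blockingStoreFiles", 9);                     // blocking store files
        conf.setLong("hbase.client.write.buffer", 16L * 1024 * 1024);          // 16MB client write buffer

        // LZO-compressed column family with the default 64KB HFile block size.
        HTableDescriptor desc = new HTableDescriptor("test_table");
        HColumnDescriptor family = new HColumnDescriptor("d");
        family.setCompressionType(Compression.Algorithm.LZO);
        family.setBlocksize(64 * 1024);
        desc.addFamily(family);
        new HBaseAdmin(conf).createTable(desc);

        // Buffered, WAL-less writes from the client.
        HTable table = new HTable(conf, "test_table");
        table.setAutoFlush(false);
        table.setWriteBufferSize(16L * 1024 * 1024);
        Put put = new Put(Bytes.toBytes("row-00000001"));
        put.setWriteToWAL(false);
        put.add(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("some value"));
        table.put(put);
        table.flushCommits();
        table.close();
      }
    }

The block size and the compression codec are per-column-family attributes, which is why they sit on the HColumnDescriptor instead of in the site configuration.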
It could be that the writes are a bit poorly distributed at the beginning of the job. The tables are created with pre-created regions, but it still took one or two splits to get it nicely distributed across all machines the last time I ran it (which was on 0.89 with an ancient LZO version).

What I also notice is that with 0.90, HBase reports close to an order of magnitude fewer requests per second (in the master UI). I used to do about 300K req/s and now it rarely gets above 40K. I am guessing that the swapping and the OS having no memory left for buffers isn't helping here, but the drop is still significant. But if it is allocating a lot of 64M blocks where it shouldn't, then that explains a bit as well.

Friso


On 4 jan 2011, at 01:54, Todd Lipcon wrote:

> Fishy. Are your cells particularly large? Or have you tuned the HFile block
> size at all?
>
> -Todd
>
> On Mon, Jan 3, 2011 at 2:15 PM, Friso van Vollenhoven <
> [email protected]> wrote:
>
>> I tried it, but it doesn't seem to help. The RS processes grow to 30Gb in
>> minutes after the job started.
>>
>> Any ideas?
>>
>>
>> Friso
>>
>>
>>
>> On 3 jan 2011, at 19:18, Todd Lipcon wrote:
>>
>>> Hi Friso,
>>>
>>> Which OS are you running? Particularly, which version of glibc?
>>>
>>> Can you try running with the environment variable MALLOC_ARENA_MAX=1 set?
>>>
>>> Thanks
>>> -Todd
>>>
>>> On Mon, Jan 3, 2011 at 8:15 AM, Friso van Vollenhoven <
>>> [email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I seem to run into a problem that occurs when using LZO compression on a
>>>> heavy, write-only load. I am using 0.90 RC1 and, thus, the LZO compressor
>>>> code that supports the reinit() method (from Kevin Weil's github, version
>>>> 0.4.8). There are some more Hadoop LZO incarnations, so I am pointing my
>>>> question to this list.
>>>>
>>>> It looks like the compressor uses direct byte buffers to store the original
>>>> and compressed bytes in memory, so the native code can work with them without
>>>> the JVM having to copy anything around. The direct buffers are possibly
>>>> reused after a reinit() call, but will often be newly created in the init()
>>>> method, because the existing buffer can be the wrong size for reuse. The
>>>> latter case leaves the buffers previously used by the compressor instance
>>>> eligible for garbage collection. I think the problem is that this
>>>> collection never occurs (in time), because the GC does not consider it
>>>> necessary yet. The GC does not know about the native heap, and based on the
>>>> state of the JVM heap there is no reason to finalize these objects yet.
>>>> However, direct byte buffers are only freed in the finalizer, so the native
>>>> heap keeps growing. On write-only loads, a full GC will rarely happen,
>>>> because the max heap will not grow far beyond the memstores (no block cache
>>>> is used). So what happens is that the machine starts using swap before the
>>>> GC will ever clean up the direct byte buffers. I am guessing that without
>>>> the reinit() support, the buffers were collected earlier, because the
>>>> referring objects would also be collected every now and then, or things
>>>> would perhaps just never get promoted to an older generation.
>>>>
>>>> When I do a pmap on a running RS after it has grown to some 40Gb resident
>>>> size (with a 16Gb heap), it shows a lot of near-64M anon blocks
>>>> (presumably native heap). I showed this before with the 0.4.6 version of
>>>> Hadoop LZO, but that was under normal load.
>>>> After that I went back to an HBase version that does not require the
>>>> reinit(). Now I am on 0.90 with the new LZO, but I never did a heavy load
>>>> like this one with that, until now...
>>>>
>>>> Can anyone with a better understanding of the LZO code confirm that the
>>>> above could be the case? If so, would it be possible to change the LZO
>>>> compressor (and decompressor) to use maybe just one fixed-size buffer (they
>>>> all appear to be near 64M anyway), or possibly reuse an existing buffer also
>>>> when it is not the exact required size but just large enough to make do?
>>>> Having short-lived direct byte buffers is apparently a discouraged practice.
>>>> If anyone can provide some pointers on what to look out for, I could invest
>>>> some time in creating a patch.
>>>>
>>>>
>>>> Thanks,
>>>> Friso
>>>>
>>>
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
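To make the buffer lifecycle described in the quoted message concrete, below is a simplified sketch of the allocation pattern being described. It is not the actual hadoop-lzo code and the class, field and method names are made up; it only illustrates how a buffer that is the wrong size for reuse gets replaced by a fresh direct allocation, while the old one waits for a finalizer that a write-only load rarely triggers.

    import java.nio.ByteBuffer;

    // Simplified sketch of the pattern described above; NOT the actual hadoop-lzo
    // source. Names are made up for illustration.
    class DirectBufferCompressorSketch {
      private int bufferSize;
      private ByteBuffer uncompressedBuf;
      private ByteBuffer compressedBuf;

      void init(int requestedSize) {
        if (uncompressedBuf == null || bufferSize != requestedSize) {
          // Wrong size for reuse: allocate fresh direct buffers. The previous ones
          // become garbage, but their native memory is only released when the GC
          // gets around to running their finalizers. On a write-only load with a
          // mostly idle JVM heap, full GCs are rare, so the native heap keeps
          // growing in the meantime.
          bufferSize = requestedSize;
          uncompressedBuf = ByteBuffer.allocateDirect(requestedSize);
          compressedBuf = ByteBuffer.allocateDirect(requestedSize);
        } else {
          // Exact size match: the existing buffers are reused.
          uncompressedBuf.clear();
          compressedBuf.clear();
        }
      }

      void reinit(int requestedSize) {
        // reinit() only gets to reuse the buffers when the size happens to match.
        init(requestedSize);
      }
    }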
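And a sketch of the second suggestion above: reuse an existing direct buffer whenever it is already large enough, instead of only when the size matches exactly. Again, this is illustrative only, not a patch against hadoop-lzo.

    import java.nio.ByteBuffer;

    // Keep one direct buffer per role and reuse it whenever its capacity is already
    // sufficient, instead of reallocating on any size change.
    class ReusableDirectBuffer {
      private ByteBuffer buf;

      ByteBuffer get(int requestedSize) {
        if (buf == null || buf.capacity() < requestedSize) {
          // Grow (or create) only when the current buffer is too small.
          buf = ByteBuffer.allocateDirect(requestedSize);
        }
        buf.clear();
        buf.limit(requestedSize); // expose only the requested window
        return buf;
      }
    }

Using a single fixed-size buffer (the other suggestion) comes down to always requesting the same size, which the same capacity check covers.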
