Sounds like a nice feature to have and to ship as the default.

St.Ack
On Tue, Mar 15, 2011 at 1:53 AM, Andrew Purtell <apurt...@apache.org> wrote:
> We have a separate compression setting for major compaction vs store files
> written during minor compaction (for background/archival apps).
>
> Why not a separate compression setting for flushing? I.e. none?
>
>
> --- On Mon, 3/14/11, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>
>> From: Jean-Daniel Cryans <jdcry...@apache.org>
>> Subject: Re: Long client pauses with compression
>> To: user@hbase.apache.org
>> Date: Monday, March 14, 2011, 7:48 PM
>> For the reasons I gave above... the puts are sometimes blocked on the
>> memstores, which are blocked by the flusher thread, which is blocked
>> because there are too many files to compact, because the compactor is
>> given too many small files to compact and has to compact the same data
>> a bunch of times.
>>
>> Also, I may not have been clear, but HBase doesn't compress data in
>> memory. This means, like I showed, that the 64MB that lives in memory
>> becomes a 6MB file after compression (using GZ). You pack a lot more
>> data into the same region, but performance is achieved by being able
>> to write big files (else we wouldn't be waiting to get to 64MB before
>> flushing).
>>
>> Without compression the files are much bigger and don't need as much
>> compaction since there are so few of them, and then it splits early, at
>> the expense of IO.
>>
>> If we were able to compress directly in memory and output the file as
>> is, then we would be able to carry a lot more data in the MemStores
>> and flush bigger files to disk... but that's not the case.
>>
>> Todd Lipcon once described this situation as HBase basically saying
>> "ok you can put as fast as you can... oh wait stop stop stop that's
>> too much... ok you can start inserting again... oh wait no that's too
>> much", etc. HBase could do a better job of slowing down inserts when
>> it detects this situation (or something like that). BTW, this jira has
>> been opened to track the issue:
>> https://issues.apache.org/jira/browse/HBASE-2981
>>
>> J-D
>>
>> On Mon, Mar 14, 2011 at 7:04 PM, Bryan Keller <brya...@gmail.com> wrote:
>> > I changed the settings as described below:
>> >
>> > hbase.hstore.blockingStoreFiles=20
>> > hbase.hregion.memstore.block.multiplier=4
>> > MAX_FILESIZE=512mb
>> > MEMSTORE_FLUSHSIZE=128mb
>> >
>> > I also created the table with 6 regions initially. Before, I wasn't
>> > creating any regions initially. I needed to make all of these changes
>> > together to entirely eliminate the very long pauses. Now there are no
>> > pauses much longer than a second.
>> >
>> > Thanks much for the help. I am still not entirely sure why compression
>> > seems to expose this problem, however.
>> >
>> >
>> > On Mar 14, 2011, at 11:54 AM, Jean-Daniel Cryans wrote:
>> >
>> >> Alright, so here's a preliminary report:
>> >>
>> >> - No compression is stable for me too, short pauses.
>> >> - LZO gave me no problems either, generally faster than no compression.
>> >> - GZ initially gave me weird results, but I quickly saw that I forgot
>> >> to copy over the native libs from the hadoop folder, so my logs were
>> >> full of:
>> >>
>> >> 2011-03-14 10:20:29,624 INFO org.apache.hadoop.io.compress.CodecPool:
>> >> Got brand-new compressor
>> >> 2011-03-14 10:20:29,626 INFO org.apache.hadoop.io.compress.CodecPool:
>> >> Got brand-new compressor
>> >> 2011-03-14 10:20:29,628 INFO org.apache.hadoop.io.compress.CodecPool:
>> >> Got brand-new compressor
>> >> 2011-03-14 10:20:29,630 INFO org.apache.hadoop.io.compress.CodecPool:
>> >> Got brand-new compressor
>> >> 2011-03-14 10:20:29,632 INFO org.apache.hadoop.io.compress.CodecPool:
>> >> Got brand-new compressor
>> >> 2011-03-14 10:20:29,634 INFO org.apache.hadoop.io.compress.CodecPool:
>> >> Got brand-new compressor
>> >> 2011-03-14 10:20:29,636 INFO org.apache.hadoop.io.compress.CodecPool:
>> >> Got brand-new compressor
>> >>
>> >> I copied the libs over, bounced the region servers, and the
>> >> performance was much more stable until a point where I got a
>> >> 20-second pause, and looking at the logs I see:
>> >>
>> >> 2011-03-14 10:31:17,625 WARN
>> >> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region
>> >> test,,1300127266461.9d0eb095b77716c22cd5c78bb503c744. has too many
>> >> store files; delaying flush up to 90000ms
>> >>
>> >> (our config sets the block at 20 store files instead of the default,
>> >> which is around 12 IIRC)
>> >>
>> >> Quickly followed by a bunch of:
>> >>
>> >> 2011-03-14 10:31:26,757 INFO
>> >> org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for
>> >> 'IPC Server handler 20 on 60020' on region
>> >> test,,1300127266461.9d0eb095b77716c22cd5c78bb503c744.: memstore size
>> >> 285.6m is >= than blocking 256.0m size
>> >>
>> >> (our settings make it so that we won't block on memstores until 4x
>> >> their size; in your case you may see a 2x blocking factor, so 128MB,
>> >> which is the default)
>> >>
>> >> The reason is that our memstores, once flushed, occupy a very small
>> >> space. Consider this:
>> >>
>> >> 2011-03-14 10:31:16,606 INFO
>> >> org.apache.hadoop.hbase.regionserver.Store: Added
>> >> hdfs://sv2borg169:9000/hbase/test/9d0eb095b77716c22cd5c78bb503c744/test/420552941380451032,
>> >> entries=216000, sequenceid=70556635737, memsize=64.3m, filesize=6.0m
>> >>
>> >> It means that it will create tiny files of ~6MB and the compactor will
>> >> spend all its time merging those files until a point where HBase must
>> >> stop inserting in order to not blow its available memory. Thus, the
>> >> same data will get rewritten a couple of times.
>> >>
>> >> Normally, and by that I mean a system where you're not just trying to
>> >> insert data ASAP but where most of your workload is made up of reads,
>> >> this works well as the memstores are filled much more slowly and
>> >> compactions happen at a normal pace.
>> >>
>> >> If you search around the interwebs for tips on speeding up HBase
>> >> inserts, you'll often see the configs I referred to earlier:
>> >>
>> >> <name>hbase.hstore.blockingStoreFiles</name>
>> >> <value>20</value>
>> >> and
>> >> <name>hbase.hregion.memstore.block.multiplier</name>
>> >> <value>4</value>
>> >>
>> >> They should work pretty well for most use cases that are made of heavy
>> >> writes, given that the region servers have enough heap (e.g. more
>> >> than 3 or 4GB).
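For anyone wanting to reproduce Bryan's fix: the two properties above belong
in hbase-site.xml on the region servers, while MAX_FILESIZE, MEMSTORE_FLUSHSIZE,
compression, and the pre-created regions are table-level and can be set at
creation time from the Java client. Below is a minimal, untested sketch against
the 0.90 client API; the table name, family name, split keys, and sizes are made
up, and the exact package of the Compression class may differ in other HBase
versions.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.hfile.Compression;
import org.apache.hadoop.hbase.util.Bytes;

public class CreatePresplitTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    HTableDescriptor desc = new HTableDescriptor("mytable");  // hypothetical table name
    desc.setMaxFileSize(1024L * 1024L * 1024L);               // MAX_FILESIZE: 1GB per region before splitting
    desc.setMemStoreFlushSize(128L * 1024L * 1024L);          // MEMSTORE_FLUSHSIZE: flush at 128MB

    HColumnDescriptor family = new HColumnDescriptor("f");    // hypothetical column family
    family.setCompressionType(Compression.Algorithm.GZ);      // or LZO if the native libs are installed
    desc.addFamily(family);

    // Five split keys pre-create six regions, as in Bryan's test.
    byte[][] splits = new byte[][] {
        Bytes.toBytes("1"), Bytes.toBytes("2"), Bytes.toBytes("3"),
        Bytes.toBytes("4"), Bytes.toBytes("5")
    };
    admin.createTable(desc, splits);
  }
}

The split keys only help if the row keys actually distribute across them, so
they need to be chosen for the specific workload.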
>> >> You should also consider setting MAX_FILESIZE to >1GB to limit the
>> >> number of regions, and MEMSTORE_FLUSHSIZE to >128MB to flush bigger
>> >> files.
>> >>
>> >> Hope this helps,
>> >>
>> >> J-D
>> >>
>> >> On Mon, Mar 14, 2011 at 10:29 AM, Jean-Daniel Cryans
>> >> <jdcry...@apache.org> wrote:
>> >>> Thanks for the report Bryan, I'll try your little program against
>> >>> one of our 0.90.1 clusters that has similar hardware.
>> >>>
>> >>> J-D
>> >>>
>> >>> On Sun, Mar 13, 2011 at 1:48 PM, Bryan Keller <brya...@gmail.com> wrote:
>> >>>> If interested, I wrote a small program that demonstrates the problem
>> >>>> (http://vancameron.net/HBaseInsert.zip). It uses Gradle, so you'll
>> >>>> need that. To run, enter "gradle run".
>> >>>>
>> >>>> On Mar 13, 2011, at 12:14 AM, Bryan Keller wrote:
>> >>>>
>> >>>>> I am using the Java client API to write 10,000 rows with about 6000
>> >>>>> columns each, via 8 threads making multiple calls to the
>> >>>>> HTable.put(List<Put>) method. I start with an empty table with one
>> >>>>> column family and no regions pre-created.
>> >>>>>
>> >>>>> With compression turned off, I am seeing very stable performance.
>> >>>>> At the start there are a couple of 10-20 sec pauses where all
>> >>>>> insert threads are blocked during a region split. Subsequent splits
>> >>>>> do not cause all of the threads to block, presumably because there
>> >>>>> are more regions, so no one region split blocks all inserts. GC for
>> >>>>> HBase during the insert is not a major problem (6k/55sec).
>> >>>>>
>> >>>>> When using either LZO or gzip compression, however, I am seeing
>> >>>>> frequent and long pauses, sometimes around 20 sec but often over 80
>> >>>>> seconds in my test. During these pauses all 8 of the threads
>> >>>>> writing to HBase are blocked. The pauses happen throughout the
>> >>>>> insert process. GCs are higher in HBase when using compression
>> >>>>> (60k, 4min), but that doesn't seem enough to explain these pauses.
>> >>>>> Overall performance obviously suffers dramatically as a result
>> >>>>> (about 2x slower).
>> >>>>>
>> >>>>> I have tested this in different configurations (single node, 4
>> >>>>> nodes) with the same result. I'm using HBase 0.90.1 (CDH3B4),
>> >>>>> Sun/Oracle Java 1.6.0_24, CentOS 5.5, Hadoop LZO 0.4.10 from
>> >>>>> Cloudera. Machines have 12 cores and 24 GB of RAM. Settings are
>> >>>>> pretty much default, nothing out of the ordinary. I tried playing
>> >>>>> around with region handler count and memstore settings, but these
>> >>>>> had no effect.
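Bryan's test program isn't reproduced in the thread (only the zip link above),
but a rough sketch of the kind of load he describes, using the 0.90 client API,
could look like the following; the table name, family name, batch size, row
keys, and cell values are all invented here.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class InsertLoadTest {
  private static final byte[] FAMILY = Bytes.toBytes("f");  // hypothetical column family
  private static final int THREADS = 8;
  private static final int ROWS_PER_THREAD = 1250;          // 8 x 1250 = 10,000 rows
  private static final int COLUMNS_PER_ROW = 6000;
  private static final int BATCH_SIZE = 10;                 // rows per put(List<Put>) call

  public static void main(String[] args) throws Exception {
    final Configuration conf = HBaseConfiguration.create();
    ExecutorService pool = Executors.newFixedThreadPool(THREADS);
    for (int t = 0; t < THREADS; t++) {
      final int threadId = t;
      pool.submit(new Runnable() {
        public void run() {
          try {
            // One HTable per thread: HTable instances are not thread-safe.
            HTable table = new HTable(conf, "mytable");      // hypothetical table name
            List<Put> batch = new ArrayList<Put>(BATCH_SIZE);
            for (int r = 0; r < ROWS_PER_THREAD; r++) {
              Put put = new Put(Bytes.toBytes(threadId + "-" + r));
              for (int c = 0; c < COLUMNS_PER_ROW; c++) {
                put.add(FAMILY, Bytes.toBytes("c" + c), Bytes.toBytes(c));
              }
              batch.add(put);
              if (batch.size() == BATCH_SIZE) {
                table.put(batch);  // this call stalls while the region is blocking updates
                batch.clear();
              }
            }
            if (!batch.isEmpty()) {
              table.put(batch);
            }
            table.close();
          } catch (Exception e) {
            e.printStackTrace();
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
  }
}

During the pauses discussed in this thread, it is the table.put(batch) calls
that block in every thread, while the region server logs the "too many store
files; delaying flush" and "Blocking updates" messages quoted earlier.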