I changed the settings as described below:

hbase.hstore.blockingStoreFiles=20
hbase.hregion.memstore.block.multiplier=4
MAX_FILESIZE=512MB
MEMSTORE_FLUSHSIZE=128MB

I also created the table with 6 regions up front; previously I wasn't 
pre-creating any regions. I needed all of these changes together to 
entirely eliminate the very long pauses. Now there are no pauses much 
longer than a second.
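
For reference, here is roughly how I am creating the table now (a sketch 
against the 0.90 Java client API; the table name, column family, and 
split keys are placeholders for my actual ones):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);

HTableDescriptor desc = new HTableDescriptor("test");
desc.setMaxFileSize(512L * 1024 * 1024);        // MAX_FILESIZE=512MB
desc.setMemStoreFlushSize(128L * 1024 * 1024);  // MEMSTORE_FLUSHSIZE=128MB
desc.addFamily(new HColumnDescriptor("cf"));

// 5 split keys give 6 initial regions; real keys should match the
// row key distribution of the data being inserted
byte[][] splits = new byte[][] {
    Bytes.toBytes("1"), Bytes.toBytes("2"), Bytes.toBytes("3"),
    Bytes.toBytes("4"), Bytes.toBytes("5")
};
admin.createTable(desc, splits);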

Thanks much for the help. I am still not entirely sure why compression seems to 
expose this problem, however.


On Mar 14, 2011, at 11:54 AM, Jean-Daniel Cryans wrote:

> Alright so here's a preliminary report:
> 
> - No compression is stable for me too, short pauses.
> - LZO gave me no problems either, generally faster than no compression.
> - GZ initially gave me weird results, but I quickly saw that I forgot
> to copy over the native libs from the hadoop folder so my logs were
> full of:
> 
> 2011-03-14 10:20:29,624 INFO org.apache.hadoop.io.compress.CodecPool:
> Got brand-new compressor
> 2011-03-14 10:20:29,626 INFO org.apache.hadoop.io.compress.CodecPool:
> Got brand-new compressor
> 2011-03-14 10:20:29,628 INFO org.apache.hadoop.io.compress.CodecPool:
> Got brand-new compressor
> 2011-03-14 10:20:29,630 INFO org.apache.hadoop.io.compress.CodecPool:
> Got brand-new compressor
> 2011-03-14 10:20:29,632 INFO org.apache.hadoop.io.compress.CodecPool:
> Got brand-new compressor
> 2011-03-14 10:20:29,634 INFO org.apache.hadoop.io.compress.CodecPool:
> Got brand-new compressor
> 2011-03-14 10:20:29,636 INFO org.apache.hadoop.io.compress.CodecPool:
> Got brand-new compressor
> 
> I copied the libs over, bounced the region servers, and the
> performance was much more stable until, at one point, I got a
> 20-second pause. Looking at the logs, I see:
> 
> 2011-03-14 10:31:17,625 WARN
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region
> test,,1300127266461.9d0eb095b77716c22cd5c78bb503c744. has too many
> store files; delaying flush up to 90000ms
> 
> (our config sets the block at 20 store files instead of the default,
> which is around 12 IIRC)
> 
> Quickly followed by a bunch of:
> 
> 2011-03-14 10:31:26,757 INFO
> org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for
> 'IPC Server handler 20 on 60020' on region
> test,,1300127266461.9d0eb095b77716c22cd5c78bb503c744.: memstore size
> 285.6m is >= than blocking 256.0m size
> 
> (our settings mean we don't block on memstores until they reach 4x
> their flush size; in your case you're probably seeing the default 2x
> blocking factor, i.e. 128MB)
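> 
> (With the default 64MB flush size, a multiplier of 4 works out to
> 4 x 64MB = 256MB before updates block, which matches the 256.0m
> threshold in the log above; the default 2x factor gives 128MB.)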
> 
> The reason is that our memstores, once flushed, occupy very little
> space on disk; consider this:
> 
> 2011-03-14 10:31:16,606 INFO
> org.apache.hadoop.hbase.regionserver.Store: Added
> hdfs://sv2borg169:9000/hbase/test/9d0eb095b77716c22cd5c78bb503c744/test/420552941380451032,
> entries=216000, sequenceid=70556635737, memsize=64.3m, filesize=6.0m
> 
> This means flushes create tiny files of ~6MB each, and the compactor
> spends all its time merging those files, up to the point where HBase
> must stop accepting inserts in order not to blow its available memory.
> As a result, the same data gets rewritten a couple of times.
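> 
> (In the flush above, 64.3MB of memstore became a 6.0MB file, roughly
> a 10:1 reduction: each flush frees ~64MB of heap but adds only a
> tiny ~6MB file to the store.)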
> 
> Normally, and by that I mean a system where you're not just trying to
> insert data ASAP but where most of your workload is made up of reads,
> this works well as the memstores are filled much more slowly and
> compactions happen at a normal pace.
> 
> If you search around the interwebs for tips on speeding up HBase
> inserts, you'll often see the configs I referred to earlier:
> 
>  <property>
>    <name>hbase.hstore.blockingStoreFiles</name>
>    <value>20</value>
>  </property>
> and
>  <property>
>    <name>hbase.hregion.memstore.block.multiplier</name>
>    <value>4</value>
>  </property>
> 
> They should work pretty well for most write-heavy use cases, provided
> the region servers have enough heap (e.g. more than 3 or 4GB). You
> should also consider setting MAX_FILESIZE to >1GB to limit the number
> of regions, and MEMSTORE_FLUSHSIZE to >128MB to flush bigger files.
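> 
> If it helps, this is roughly how to set those two table attributes
> with the Java admin API (a sketch I haven't run; IIRC in 0.90 the
> table has to be disabled before it can be modified):
> 
> HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
> HTableDescriptor desc = admin.getTableDescriptor(Bytes.toBytes("test"));
> desc.setMaxFileSize(1024L * 1024 * 1024);       // MAX_FILESIZE, 1GB
> desc.setMemStoreFlushSize(256L * 1024 * 1024);  // MEMSTORE_FLUSHSIZE, 256MB
> admin.disableTable("test");
> admin.modifyTable(Bytes.toBytes("test"), desc);
> admin.enableTable("test");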
> 
> Hope this helps,
> 
> J-D
> 
> On Mon, Mar 14, 2011 at 10:29 AM, Jean-Daniel Cryans
> <jdcry...@apache.org> wrote:
>> Thanks for the report Bryan, I'll try your little program against one
>> of our 0.90.1 clusters that has similar hardware.
>> 
>> J-D
>> 
>> On Sun, Mar 13, 2011 at 1:48 PM, Bryan Keller <brya...@gmail.com> wrote:
>>> If interested, I wrote a small program that demonstrates the problem 
>>> (http://vancameron.net/HBaseInsert.zip). It uses Gradle, so you'll need 
>>> that. To run, enter "gradle run".
>>> 
>>> On Mar 13, 2011, at 12:14 AM, Bryan Keller wrote:
>>> 
>>>> I am using the Java client API to write 10,000 rows with about 6000 
>>>> columns each, via 8 threads making multiple calls to the 
>>>> HTable.put(List<Put>) method. I start with an empty table with one column 
>>>> family and no regions pre-created.
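>>>> 
>>>> In outline, each writer thread does something like this (simplified
>>>> from the actual program; conf is the usual HBaseConfiguration, and
>>>> the table name, family, batch size, and VALUE payload are
>>>> placeholders):
>>>> 
>>>> HTable table = new HTable(conf, "test");
>>>> List<Put> batch = new ArrayList<Put>();
>>>> for (int row = start; row < end; row++) {
>>>>   Put put = new Put(Bytes.toBytes(row));
>>>>   for (int col = 0; col < 6000; col++) {
>>>>     put.add(Bytes.toBytes("cf"), Bytes.toBytes(col), VALUE);
>>>>   }
>>>>   batch.add(put);
>>>>   if (batch.size() >= 10) {   // accumulate a few rows per RPC
>>>>     table.put(batch);
>>>>     batch.clear();
>>>>   }
>>>> }
>>>> if (!batch.isEmpty()) table.put(batch);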
>>>> 
>>>> With compression turned off, I am seeing very stable performance. At the 
>>>> start there are a couple of 10-20 sec pauses where all insert threads are 
>>>> blocked during a region split. Subsequent splits do not cause all of the 
>>>> threads to block, presumably because there are more regions, so no single 
>>>> region split blocks all inserts. GC in HBase during the insert is not a 
>>>> major problem (6k/55sec).
>>>> 
>>>> When using either LZO or gzip compression, however, I am seeing frequent 
>>>> and long pauses, sometimes around 20 seconds but often over 80 seconds in 
>>>> my test. During these pauses, all 8 of the threads writing to HBase are 
>>>> blocked. The pauses happen throughout the insert process. GC activity in 
>>>> HBase is higher when using compression (60k, 4min), but that doesn't seem 
>>>> enough to explain these pauses. Overall performance obviously suffers 
>>>> dramatically as a result (about 2x slower).
>>>> 
>>>> I have tested this in different configurations (single node, 4 nodes) with 
>>>> the same result. I'm using HBase 0.90.1 (CDH3B4), Sun/Oracle Java 
>>>> 1.6.0_24, CentOS 5.5, Hadoop LZO 0.4.10 from Cloudera. Machines have 12 
>>>> cores and 24 GB of RAM. Settings are pretty much default, nothing out of 
>>>> the ordinary. I tried playing around with region handler count and 
>>>> memstore settings, but these had no effect.
>>>> 
>>> 
>>> 
>> 
