Thanks for the report, Bryan. I'll try your little program against one
of our 0.90.1 clusters that has similar hardware.

J-D

On Sun, Mar 13, 2011 at 1:48 PM, Bryan Keller <brya...@gmail.com> wrote:
> If interested, I wrote a small program that demonstrates the problem 
> (http://vancameron.net/HBaseInsert.zip). It uses Gradle, so you'll need that. 
> To run, enter "gradle run".
>
> On Mar 13, 2011, at 12:14 AM, Bryan Keller wrote:
>
>> I am using the Java client API to write 10,000 rows with about 6000 columns 
>> each, via 8 threads making multiple calls to the HTable.put(List<Put>) 
>> method. I start with an empty table with one column family and no regions 
>> pre-created.
>>
>> With compression turned off, I am seeing very stable performance. At the 
>> start there are a couple of 10-20 sec pauses where all insert threads are 
>> blocked during a region split. Subsequent splits do not cause all of the 
>> threads to block, presumably because there are more regions so no one region 
>> split blocks all inserts. GCs for HBase during the insert are not a major 
>> problem (6k/55sec).
>>
>> When using either LZO or gzip compression, however, I am seeing frequent and 
>> long pauses, sometimes around 20 sec but often over 80 seconds in my test. 
>> During these pauses all 8 of the threads writing to HBase are blocked. The 
>> pauses happen throughout the insert process. GCs are higher in HBase when 
>> using compression (60k, 4min), but it doesn't seem enough to explain these 
>> pauses. Overall performance obviously suffers dramatically as a result 
>> (about 2x slower).
>>
>> I have tested this in different configurations (single node, 4 nodes) with 
>> the same result. I'm using HBase 0.90.1 (CDH3B4), Sun/Oracle Java 1.6.0_24, 
>> CentOS 5.5, Hadoop LZO 0.4.10 from Cloudera. Machines have 12 cores and 24 
>> GB of RAM. Settings are pretty much default, nothing out of the ordinary. I 
>> tried playing around with region handler count and memstore settings, but 
>> these had no effect.
>>
>
>
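
For anyone who wants to reproduce the measurement without the linked zip, here is a minimal sketch of a pause probe along the lines of the setup described in the quoted report: several writer threads send rows in batches and record every call that takes longer than a threshold. It is not Bryan's program. The names PauseProbe, BatchSink and run are hypothetical, and BatchSink is only a stand-in for a batched write such as HTable.put(List<Put>), so the sketch runs without an HBase cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Times each batched write issued by several threads and records the slow ones. */
class PauseProbe {

    /** Hypothetical stand-in for a batched write such as HTable.put(List<Put>). */
    interface BatchSink {
        void write(List<String> batch) throws Exception;
    }

    /**
     * Splits `rows` row keys across `threads` writers, sends them in batches of
     * `batchSize`, and returns the duration in milliseconds of every write call
     * that took at least `thresholdMs`.
     */
    static List<Long> run(BatchSink sink, int threads, int rows,
                          int batchSize, long thresholdMs) throws InterruptedException {
        ConcurrentLinkedQueue<Long> pauses = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(() -> {
                List<String> batch = new ArrayList<>();
                // Writer `id` handles rows id, id + threads, id + 2 * threads, ...
                for (int row = id; row < rows; row += threads) {
                    batch.add("row-" + row);
                    boolean lastRow = row + threads >= rows;
                    if (batch.size() == batchSize || lastRow) {
                        long start = System.nanoTime();
                        try {
                            sink.write(batch);
                        } catch (Exception e) {
                            throw new RuntimeException(e);
                        }
                        long ms = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                        if (ms >= thresholdMs) {
                            pauses.add(ms);
                        }
                        batch = new ArrayList<>();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return new ArrayList<>(pauses);
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated sink that sleeps 5 ms per batch; swap in a real client call to test a cluster.
        List<Long> pauses = run(batch -> Thread.sleep(5), 8, 10_000, 100, 1_000);
        System.out.println("calls slower than 1 s: " + pauses.size());
    }
}
```

To point this at a real table, the sink would convert each batch into Put objects and call the client's batched put. If all eight writers report a long call at about the same time, that matches the "all threads blocked" symptom in the report.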
