Thanks for the report, Bryan. I'll try your little program against one of our 0.90.1 clusters that has similar hardware.

J-D

On Sun, Mar 13, 2011 at 1:48 PM, Bryan Keller <brya...@gmail.com> wrote:
> If interested, I wrote a small program that demonstrates the problem
> (http://vancameron.net/HBaseInsert.zip). It uses Gradle, so you'll need that.
> To run, enter "gradle run".
>
> On Mar 13, 2011, at 12:14 AM, Bryan Keller wrote:
>
>> I am using the Java client API to write 10,000 rows with about 6000 columns
>> each, via 8 threads making multiple calls to the HTable.put(List<Put>)
>> method. I start with an empty table with one column family and no regions
>> pre-created.
>>
>> With compression turned off, I am seeing very stable performance. At the
>> start there are a couple of 10-20sec pauses where all insert threads are
>> blocked during a region split. Subsequent splits do not cause all of the
>> threads to block, presumably because there are more regions so no one region
>> split blocks all inserts. GCs for HBase during the insert is not a major
>> problem (6k/55sec).
>>
>> When using either LZO or gzip compression, however, I am seeing frequent and
>> long pauses, sometimes around 20 sec but often over 80 seconds in my test.
>> During these pauses all 8 of the threads writing to HBase are blocked. The
>> pauses happen throughout the insert process. GCs are higher in HBase when
>> using compression (60k, 4min), but it doesn't seem enough to explain these
>> pauses. Overall performance obviously suffers dramatically as a result
>> (about 2x slower).
>>
>> I have tested this in different configurations (single node, 4 nodes) with
>> the same result. I'm using HBase 0.90.1 (CDH3B4), Sun/Oracle Java 1.6.0_24,
>> CentOS 5.5, Hadoop LZO 0.4.10 from Cloudera. Machines have 12 cores and 24
>> gb of RAM. Settings are pretty much default, nothing out of the ordinary. I
>> tried playing around with region handler count and memstore settings, but
>> these had no effect.
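
For anyone wanting to measure the pauses described in the quoted message, here is a rough sketch of how the per-batch stall time could be recorded from the client side. It is not Bryan's program and does not touch HBase at all: `BatchSink` is a hypothetical stand-in for a batched write such as HTable.put(List<Put>), and the class name, parameters, and string row keys are all assumptions made for illustration. It only times each batch call across N writer threads and counts the calls that exceed a threshold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

class InsertPauseProbe {

    /** Hypothetical stand-in for a batched write such as HTable.put(List<Put>). */
    interface BatchSink {
        void write(List<String> rowKeys) throws Exception;
    }

    /** Summary of one run: batch count, batches over the threshold, longest batch. */
    static final class Result {
        final long totalBatches;
        final long slowBatches;
        final long maxBatchMillis;

        Result(long totalBatches, long slowBatches, long maxBatchMillis) {
            this.totalBatches = totalBatches;
            this.slowBatches = slowBatches;
            this.maxBatchMillis = maxBatchMillis;
        }
    }

    static Result run(final BatchSink sink, int threads, int totalRows,
                      final int batchSize, final long pauseThresholdMillis)
            throws Exception {
        final AtomicLong batches = new AtomicLong();
        final AtomicLong slow = new AtomicLong();
        final AtomicLong maxNanos = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> pending = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                // Each writer thread owns a contiguous slice of the row range.
                final int start = (int) ((long) t * totalRows / threads);
                final int end = (int) ((long) (t + 1) * totalRows / threads);
                pending.add(pool.submit((Callable<Void>) () -> {
                    for (int i = start; i < end; i += batchSize) {
                        List<String> batch = new ArrayList<>();
                        for (int r = i; r < Math.min(i + batchSize, end); r++) {
                            batch.add("row-" + r);
                        }
                        // Time only the write call itself.
                        long t0 = System.nanoTime();
                        sink.write(batch);
                        long elapsed = System.nanoTime() - t0;
                        batches.incrementAndGet();
                        maxNanos.accumulateAndGet(elapsed, Math::max);
                        if (elapsed >= pauseThresholdMillis * 1_000_000L) {
                            slow.incrementAndGet();
                        }
                    }
                    return null;
                }));
            }
            for (Future<Void> f : pending) {
                f.get(); // surfaces any writer failure as an ExecutionException
            }
        } finally {
            pool.shutdown();
        }
        return new Result(batches.get(), slow.get(), maxNanos.get() / 1_000_000L);
    }
}
```

With a real client, the sink would wrap the actual put call; comparing `slowBatches` and `maxBatchMillis` between a run with compression off and a run with LZO or gzip would put numbers on the difference reported above.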