I'm doing a test now w/o any GZ compression enabled and I am seeing the same
pauses in loading... any more ideas? I will try dropping my region size down
to 256 MB next. Currently I cannot get any sustained writing via thrift for
more than a few seconds before it all pauses.
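(For reference, the 256 MB region-size change I'm planning would be a hbase-site.xml edit roughly like the following — the value is just 256*1024*1024 bytes:)

```xml
<!-- hbase-site.xml: drop max region size from 1GB to 256MB -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>268435456</value>
</property>
```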

-chris


On Tue, Jan 11, 2011 at 10:18 AM, Chirstopher Tarnas <[email protected]> wrote:

> Hi Stack,
>
> Thanks for taking a look. I think I caught a regionserver compacting:
>
> http://pastebin.com/y9BQaVeJ
>
> http://pastebin.com/ZMxwEX5j
>
> thanks again,
> -chris
>
> On Mon, Jan 10, 2011 at 1:52 PM, Stack <[email protected]> wrote:
>
>> Odd.  Mind thread dumping the regionserver a few times and
>> pastebinning it during a compaction so we can see where it's spending
>> time?  (Your compaction numbers are bad.)
>>
>> St.Ack
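(For anyone following along: the thread dumps Stack is asking for can be grabbed with jstack against the regionserver JVM. This is a sketch — it assumes a single HRegionServer process per node and that jstack/jps from the JDK are on the PATH:)

```shell
# Find the regionserver's JVM pid (assumes one HRegionServer per node)
RS_PID=$(jps | awk '/HRegionServer/ {print $1}')

# Take a few dumps ~10s apart while the compaction is running,
# then pastebin the resulting files
for i in 1 2 3; do
  jstack "$RS_PID" > "rs-threaddump-$i.txt"
  sleep 10
done
```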
>>
>> On Fri, Jan 7, 2011 at 11:07 PM, Chris Tarnas <[email protected]> wrote:
>> > Thanks in advance for any help. I've been quite pleased with HBase for
>> this project, and until this problem it has worked quite well.
>> >
>> > Test cluster setup is CDH3b3 on 7 nodes:
>> > 5 data nodes with 48GB RAM, 8 cores, and 4 disks each
>> > 2 masters with 8 cores, 2 disks, and 24GB RAM for master/zookeeper/namenode
>> >
>> > My hbase.hregion.max.filesize is set to 1GB, the open-files ulimit to 32k,
>> xceivers to 4096, and the HBase heap is at 8GB.
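(For reference, those settings would look roughly like this in the CDH3-era config files — values as stated above:)

```xml
<!-- hbase-site.xml: 1GB max region size before split -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>1073741824</value>
</property>

<!-- hdfs-site.xml: xceiver limit (note the property's historical misspelling) -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
```

The 8GB heap would typically be set via HBASE_HEAPSIZE (in MB) in hbase-env.sh.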
>> >
>> > I'm testing out GZ compression on two tables, each currently still a
>> single region. My test runs fine when compression is off, so this is
>> definitely related to compression. When I start loading data (via thrift,
>> many clients) it loads well for a while, then the regionservers slow to a
>> crawl. When this happens, the two regionservers hosting the tables use
>> ~110-160% CPU and block writes. One regionserver has occasional bursts of
>> activity but its log is mostly very repetitive; here is a sample:
>> >
>> > http://pastebin.com/WSc8aZFQ
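(For context, GZ compression is set per column family from the HBase shell; the table and column-family names below are placeholders, not the ones from this cluster:)

```
# hbase shell: enable GZ on a column family (names are placeholders)
disable 'mytable'
alter 'mytable', {NAME => 'cf', COMPRESSION => 'GZ'}
enable 'mytable'
```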
>> >
>> > The other active regionserver looks to be continuously compacting:
>> >
>> > http://pastebin.com/3ifVKaX2
>> >
>> >
>> > The master log is quite boring, with the following being repeated:
>> >
>> > 2011-01-08 00:48:58,419 INFO org.apache.hadoop.hbase.master.BaseScanner:
>> RegionManager.rootScanner scanning meta region {server: 10.56.24.8:60020,
>> regionname: -ROOT-,,0.70236052, startKey: <>}
>> > 2011-01-08 00:48:58,424 INFO org.apache.hadoop.hbase.master.BaseScanner:
>> RegionManager.rootScanner scan of 1 row(s) of meta region {server:
>> 10.56.24.8:60020, regionname: -ROOT-,,0.70236052, startKey: <>} complete
>> > 2011-01-08 00:48:58,444 INFO
>> org.apache.hadoop.hbase.master.ServerManager: 5 region servers, 0 dead,
>> average load 1.6
>> > 2011-01-08 00:49:04,810 INFO org.apache.hadoop.hbase.master.BaseScanner:
>> RegionManager.metaScanner scanning meta region {server: 10.56.24.7:60020,
>> regionname: .META.,,1.1028785192, startKey: <>}
>> > 2011-01-08 00:49:04,820 INFO org.apache.hadoop.hbase.master.BaseScanner:
>> RegionManager.metaScanner scan of 6 row(s) of meta region {server:
>> 10.56.24.7:60020, regionname: .META.,,1.1028785192, startKey: <>}
>> complete
>> > 2011-01-08 00:49:04,820 INFO org.apache.hadoop.hbase.master.BaseScanner:
>> All 1 .META. region(s) scanned
>> >
>> >
>> > At this point loading slows to a trickle (requests show 0 in the web UI);
>> I see only infrequent, very small bursts of loading. Each table has just
>> one region (and there are only two other tables, each also with a single
>> region).
>> >
>> > I've compiled and tested the native GZ compression codecs on the nodes;
>> the nodes have plenty of CPU, I/O, and memory available, and there is no
>> swapping. Any suggestions? Please let me know if you need any other info.
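(One quick way to confirm the native GZ codec is actually being picked up at the HBase layer is the bundled CompressionTest tool; the HDFS path below is a placeholder:)

```shell
# Write and read back a GZ-compressed test file through HBase's codec path;
# check the log output for native zlib loading vs. the pure-Java fallback.
# The path is a placeholder.
hbase org.apache.hadoop.hbase.util.CompressionTest hdfs:///tmp/testfile gz
```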
>> >
>> > thanks!
>> > -chris
>>
>
>
