Re: big compaction queue size

Gaojinchao Tue, 06 Sep 2011 19:19:47 -0700

Hi J-D
Could we give a formula for the number of active regions per node and add it to
the book? I think many people encounter the same problem.


I think the formula is:
If ((Hlognumber * hdfsblock) > (HBASE_HEAPSIZE * memstore.lowerLimit))
   Active Regions = (HBASE_HEAPSIZE * memstore.lowerLimit) / (flush.size / (2~3))
Else
   Active Regions = (Hlognumber * hdfsblock) / (flush.size / (2~3))
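
The proposed formula can be sketched in Python as follows. This is only a restatement of the rule of thumb above, not anything HBase computes itself: the function and parameter names are hypothetical, the example values are illustrative rather than taken from the thread, and the "2~3" divisor is exposed as a tunable `fill_divisor`. Both branches of the If/Else reduce to taking the smaller of the two budgets.

```python
MIB = 1024 ** 2


def estimate_active_regions(heap_bytes, memstore_lower_limit,
                            hlog_count, hdfs_block_bytes,
                            flush_size_bytes, fill_divisor=2.5):
    """Rule-of-thumb estimate of active regions per node.

    All names are hypothetical; fill_divisor stands for the "2~3"
    factor in the formula above.
    """
    # Memory the memstores may hold before the lower limit is reached.
    memstore_budget = heap_bytes * memstore_lower_limit
    # Data the write-ahead logs can hold before old logs must be cleaned.
    hlog_budget = hlog_count * hdfs_block_bytes
    # The If/Else in the formula picks whichever budget is smaller.
    budget = min(memstore_budget, hlog_budget)
    # Assumed average memstore size per active region.
    per_region = flush_size_bytes / fill_divisor
    return int(budget // per_region)


# Illustrative values (assumptions): 8 GiB heap, lowerLimit 0.35,
# 32 logs of 64 MiB each, 64 MiB flush size, divisor 2.
log_bound = estimate_active_regions(8 * 1024 * MIB, 0.35, 32, 64 * MIB,
                                    64 * MIB, fill_divisor=2)

# Same settings with a 4 GiB heap, where the memstore budget is smaller.
heap_bound = estimate_active_regions(4 * 1024 * MIB, 0.35, 32, 64 * MIB,
                                     64 * MIB, fill_divisor=2)
print(log_bound, heap_bound)
```

With these assumed values the first case is bounded by the log budget and the second by the memstore budget.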


If I am wrong, please correct me. Thanks.


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Jean-Daniel Cryans
Sent: September 7, 2011 1:19
To: [email protected]
Subject: Re: big compaction queue size

Inline.

J-D

> We're running a 33-regionserver hbase cluster on top of cdh3u0 suites. On
> average, we have 2400 regions hosted
> on each regionserver. (hbase.hregion.max.filesize is 1.5GB, and we have
> value size up to 4MB per object).

2400 regions is just too many. If you are importing data at a high rate
(which might be the case with such fat values) and the writes are well
distributed among those regions, then you will be force flushing tons of
small files every time a region server needs to clean a log file.
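
A rough back-of-the-envelope sketch of why so many regions lead to small flushes. It assumes writes are spread evenly across all regions and that the logs can hold 32 files of 64 MiB each; both numbers are illustrative assumptions, not values from this thread, and the function name is hypothetical.

```python
MIB = 1024 ** 2


def avg_forced_flush_bytes(hlog_count, hlog_file_bytes, regions):
    """Average memstore size per region when the logs fill up.

    Assumes writes are spread uniformly over all regions, so each
    region holds an equal share of the un-flushed data.
    """
    return (hlog_count * hlog_file_bytes) / regions


# Assumed log capacity: 32 files of 64 MiB each.
with_2400_regions = avg_forced_flush_bytes(32, 64 * MIB, 2400)
with_100_regions = avg_forced_flush_bytes(32, 64 * MIB, 100)

print(f"{with_2400_regions / MIB:.2f} MiB per flush with 2400 regions")
print(f"{with_100_regions / MIB:.2f} MiB per flush with 100 regions")
```

Under these assumptions each forced flush with 2400 regions writes a file of well under 1 MiB, which is what feeds the compaction queue.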

Try to bring it down to about 100 regions per node, then disable splitting,
and also set a bigger flush size on your table.

>
> I checked the regionserver log; the compaction queue size is about 1700,
> and every compaction takes about 1 minute. Moreover, most of the
> compactions are promoted to major ones.
>
> My questions are:
>
> 1. Would this cause the performance degradation? It seems like a "GET"
> issued within two minutes before or after a compaction takes much longer
> than usual. I thought compaction was an asynchronous operation.

It's async, but still uses IO resources which may impact latency.
Compacting also creates new blocks so the block cache is churning
through invalid blocks.

> 2. What issues would cause long-running compactions?

It's more like a systemic issue, everything influences everything else.

> 3. It seems like HBASE-1476 is going to implement multi-threaded compaction,
> I guess it would help to reduce
> the size of compaction queue.

But it would also generate more IO; in your case 1476 would probably not help at all.
