Re: major compaction time constantly growing

Jean-Daniel Cryans Thu, 22 Sep 2011 10:36:01 -0700

Inline.

J-D

On Wed, Sep 21, 2011 at 11:31 AM, Oleg Ruchovets <[email protected]> wrote:
> Hi ,
>
>  Our environment:
>    HBase 90.2
>    We have a 10-machine grid:
>    the master has 48G RAM
>    the slave machines have 16G RAM each
>    the Region Server process has 4G RAM
>    the ZooKeeper process has 2G RAM
>    We run 4 mappers / 2 reducers per machine
>
> We write to HBase from M/R jobs (2 jobs a day).
>   1) We run major compaction manually once a day
>   2) We increased the region size to prevent automatic splits
>   3) We don't use the WAL (WAL is false).
>
> We started running major compaction 3 months ago, on a daily basis. Major
> compaction execution time at the very beginning was ~1 hour. Each day
> major compaction takes longer than the day before, and today it took 8.5 hours.
>
>   1) Is it expected behaviour that major compaction time keeps
> growing?

If your dataset is growing, yes. When you major compact, you rewrite
all the data.
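
For a rough sense of how much time you have left, here is a
back-of-the-envelope sketch. It assumes the compaction time keeps
growing linearly, which is only an assumption, and it takes "3 months"
as 90 days. The function name is made up for illustration.

```python
def days_until_limit(start_hours, current_hours, elapsed_days, limit_hours=24.0):
    """Linear extrapolation: days until one compaction run reaches limit_hours."""
    growth_per_day = (current_hours - start_hours) / elapsed_days
    if growth_per_day <= 0:
        return None  # not growing, so the limit is never reached under this model
    return max(0.0, (limit_hours - current_hours) / growth_per_day)

# Numbers from the report above: ~1 hour at the start, 8.5 hours now,
# roughly 90 days apart (assumed value for "3 months").
remaining = days_until_limit(1.0, 8.5, 90)
print(f"about {remaining:.0f} days until a run takes 24 hours")
```

With those numbers the linear model gives roughly six months before a
single run no longer fits in a day. If the data growth accelerates, it
will be sooner.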

>   2) What is the best practice to resolve such behaviour? At some
> point we will have a major compaction that takes more than 24 hours (but we
> run it on a daily basis).

You might want to find the root cause. If your data is growing, then
at some point you might want to grow the cluster too.

You might also want to check whether you need to major compact that
often: if you don't delete/overwrite/TTL your values a lot, then you
probably don't need more than a weekly major compaction run.

>   3) If it is not expected behaviour, how can we debug the major compaction
> process to find out what could be causing the problem?

See if some machine takes more time than the others, check that your
disks are not spamming kern.log, basically look for signs of hardware
problems.
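
To compare machines, something like the sketch below is enough. It
assumes you have already collected how long the compaction took on each
region server (for example by hand from the logs); the host names, the
timings and the 1.5x-median cutoff are all made up for illustration.

```python
from statistics import median

def slow_servers(durations_min, factor=1.5):
    """Return hosts whose compaction time exceeds factor * median, sorted by name."""
    if not durations_min:
        return []
    cutoff = factor * median(durations_min.values())
    return sorted(host for host, d in durations_min.items() if d > cutoff)

# Hypothetical per-region-server timings in minutes.
example = {"rs1": 50, "rs2": 55, "rs3": 48, "rs4": 140}
print(slow_servers(example))  # prints ['rs4']
```

A machine that shows up here day after day is the one whose disks and
kern.log deserve a closer look.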

>
> Thanks in advance
> Oleg.
>
