Thank you for your detailed explanation and also for the code reference. It 
definitely helps a lot!
Regards,
Iulia
________________________________________
From: [email protected] [[email protected]] On Behalf Of Stack 
[[email protected]]
Sent: Tuesday, April 05, 2011 7:17 PM
To: [email protected]
Cc: Iulia Zidaru
Subject: Re: Compactions in busy system(s)

See below.

On Tue, Apr 5, 2011 at 3:45 AM, Iulia Zidaru <[email protected]> wrote:
> It is important to run major compaction when we have a lot of deleted data,
> as it removes the "marked as deleted" flags.

Yes, major compaction removes the garbage.

As to its being 'important': as long as the minor compactions are
running and you don't mind a bit of bloat, you should be able to
put off running the major until you have a trough in your cluster
usage.

By default major compactions run once a day.  They have a tendency to
start running just as the site is experiencing peak load.  Lots of us
disable automatic major compactions and run them manually (see Michael
Segel's note for how).
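For reference, the usual recipe is to zero out the major-compaction
interval in hbase-site.xml and then kick off majors by hand during a
quiet period. The property name is from the 0.90-era defaults; the
table name below is just an example:

```shell
# In hbase-site.xml, disable time-based major compactions
# (a value of 0 means "never run automatically"):
#
#   <property>
#     <name>hbase.hregion.majorcompaction</name>
#     <value>0</value>
#   </property>
#
# Then trigger a major compaction manually when load is low.
# 'usertable' is an example table name.
echo "major_compact 'usertable'" | hbase shell
```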

> There are also the "flush" and "minor compaction" operations associated with
> writing to disk. I understand that in a minor compaction, many files
> resulting from flush operations are rewritten into a single file. What is not
> very clear is whether major compaction does the same operation (and so it
> can be skipped if there are no deletes in the system) or whether there is also a
> particular operation which is not done in minor compaction, such that skipping it
> may affect performance or volume.
>

Minors will usually pick up a couple of the smaller adjacent files and
rewrite them as one.  Minors do not drop deletes or expired cells;
this is for majors to do.  Sometimes a minor will pick up all the
files in the store and in this case it actually promotes itself to
being a major compaction.  How the minor picks files to compact is
explained in a pretty ascii diagram in the src.  See here:
http://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/Store.html#836
 After a major compaction runs, you have a single file only.  This
will usually help performance.  But major compactions rewrite all of
the store's data, and on a loaded system this may not be tenable; the
majors will usually have to be managed.
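As a rough model of that ratio-based selection (a simplified sketch,
not the actual Store.java code; the ratio and minimum-file-count
parameters are illustrative stand-ins for the real config knobs):

```python
def select_for_compaction(sizes, ratio=1.2, min_files=2):
    """Pick store files for a minor compaction (simplified model).

    sizes: store file sizes, oldest first.  Walk from oldest to newest,
    skipping each large old file whose size exceeds ratio * (sum of all
    newer files); compact the remaining run of smaller files.

    Returns (chosen, promoted): promoted is True when every file in the
    store was selected -- the case where a minor is promoted to a major.
    """
    start = 0
    while start < len(sizes) - 1 and sizes[start] > ratio * sum(sizes[start + 1:]):
        start += 1
    chosen = sizes[start:]
    if len(chosen) < min_files:
        return [], False
    return chosen, len(chosen) == len(sizes)

# One big old file is skipped; the small files get minor-compacted:
print(select_for_compaction([300, 30, 20, 10]))  # ([30, 20, 10], False)
# All files qualify, so the compaction is promoted to a major:
print(select_for_compaction([30, 20, 10]))       # ([30, 20, 10], True)
```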


> Another thing I'd like you to help me clarify is whether major
> compaction of the whole dataset is the sum of major compactions of all regions. If
> so, it should be possible to major compact only some regions at a time, and other
> regions at another time.

Yes (see Michael's note).
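In practice the shell's major_compact command accepts a region name as
well as a table name, so regions can be compacted one at a time on a
rolling schedule. REGION_NAME below is a placeholder for a real region
name (e.g. one listed in the master web UI):

```shell
# Major compact a single region rather than the whole table.
echo "major_compact 'REGION_NAME'" | hbase shell
```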


> I also don't understand well whether it is possible for
> the system to merge a region with less data into another region and, if it
> does, which of the mentioned operations might affect good system
> behavior (i.e. what NOT to do).
>

We do not currently have an online merge.  We have an offline merge --
which is close to useless in the scheme of things.  We need an online
merge for the case where a user started with a "small regions"
configuration but then, as their data grew, figured they needed a
"large regions" configuration; the merge would aggregate up a
bunch of small regions (fewer regions is usually better).
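For completeness, the offline merge lives in a utility class and
requires the cluster to be down first. This is the rough invocation
from memory (region names are placeholders); check the tool's own usage
output for the exact arguments on your version:

```shell
# Cluster must be stopped before running the offline merge.
./bin/stop-hbase.sh
# Merge two adjacent regions of 'mytable' (names are placeholders).
./bin/hbase org.apache.hadoop.hbase.util.Merge mytable REGION_1 REGION_2
```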

> The last point is regarding the files in HDFS (this might affect
> volume). When is the data deleted from HDFS (in minor and major compaction)?
> Are the files deleted when a compaction is performed, or are they only marked
> as deleted?
>

They are deleted after the new file is created and swapped in to
replace the old.

Hope this helps.  Keep asking questions,
St.Ack
