Hi, Stack:
Thanks for the explanation. I looked at the code, and it seems that the old region should get compacted and data older than the TTL should get removed. I will run a test on a table with a 10-minute TTL: insert enough data to fill several regions, wait one day, and see whether the old records are indeed removed.

Jimmy.

--------------------------------------------------
From: "Stack" <[email protected]>
Sent: Wednesday, September 15, 2010 9:53 PM
To: <[email protected]>
Subject: Re: hbase doesn't delete data older than TTL in old regions

On Wed, Sep 15, 2010 at 5:50 PM, Jinsong Hu <[email protected]> wrote:
One thing I am not clear on about major compaction: for regions with
a single map file,
will HBase actually load it and remove the records older than the TTL?

Major compactions will run even if there is only one file, unless that file is
already the product of a major compaction (files that have been major
compacted get a marker in their metadata so the next time a major
compaction runs we'll skip the file) and the time since the last major
compaction is < TTL (see
http://hbase.apache.org/docs/r0.89.20100726/xref/org/apache/hadoop/hbase/regionserver/Store.html#743).
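
One reading of the single-file check described above, as a rough Java sketch. The class, method, and constant names are made up for illustration and are not the actual Store.java code; it assumes a single-file store is skipped only when the file already carries the major-compaction marker and less than one TTL has passed since then.

```java
public class MajorCompactionCheck {
    // Hypothetical sentinel meaning "no TTL configured"; not the HBase constant.
    static final long FOREVER = Long.MAX_VALUE;

    /**
     * Decide whether a store holding a single file should be major compacted.
     *
     * @param alreadyMajorCompacted true if the file carries the major-compaction marker
     * @param elapsedMs             time since the last major compaction, in milliseconds
     * @param ttlMs                 the column family TTL in milliseconds, or FOREVER
     */
    static boolean shouldCompactSingleFile(boolean alreadyMajorCompacted,
                                           long elapsedMs, long ttlMs) {
        // Skip only when the file is already major compacted AND either there
        // is no TTL or the TTL has not yet elapsed since that compaction.
        boolean skip = alreadyMajorCompacted
                && (ttlMs == FOREVER || elapsedMs < ttlMs);
        return !skip;
    }

    public static void main(String[] args) {
        long ttl = 10 * 60 * 1000L;      // 10-minute TTL
        long oneDay = 24 * 60 * 60 * 1000L;

        // Never major compacted: compact.
        System.out.println(shouldCompactSingleFile(false, 0L, ttl));
        // Major compacted one minute ago: skip.
        System.out.println(shouldCompactSingleFile(true, 60 * 1000L, ttl));
        // Major compacted a day ago with a 10-minute TTL: compact again.
        System.out.println(shouldCompactSingleFile(true, oneDay, ttl));
    }
}
```

Under this reading, the one-day wait in the proposed test is far longer than the 10-minute TTL, so an already-compacted single file would become eligible again once the periodic check runs.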

The RegionServer runs a major compaction checking thread that runs periodically.

So, it should be doing what you want (if a little crudely, given it's
waiting a full TTL before rechecking a file that is already major compacted).

We could improve on this by looking at the oldest timestamp in the file
every time we run the major compaction check.
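
A rough Java sketch of what that improvement might look like. This only illustrates the idea and is not existing HBase code: the class and method names are hypothetical, and it assumes the oldest cell timestamp in the file is available to the check.

```java
public class OldestTimestampCheck {
    /**
     * Return true if the file holds at least one cell that is past its TTL,
     * judged by the oldest timestamp in the file rather than by the time
     * since the last major compaction.
     *
     * @param oldestTimestampMs timestamp of the oldest cell in the file (assumed known)
     * @param nowMs             current time in milliseconds
     * @param ttlMs             the column family TTL in milliseconds
     */
    static boolean hasExpiredData(long oldestTimestampMs, long nowMs, long ttlMs) {
        return nowMs - oldestTimestampMs > ttlMs;
    }

    public static void main(String[] args) {
        long ttl = 10 * 60 * 1000L;               // 10-minute TTL
        long now = System.currentTimeMillis();

        // Oldest cell written 5 minutes ago: nothing has expired yet.
        System.out.println(hasExpiredData(now - 5 * 60 * 1000L, now, ttl));
        // Oldest cell written 15 minutes ago: a major compaction would drop it.
        System.out.println(hasExpiredData(now - 15 * 60 * 1000L, now, ttl));
    }
}
```

With a check like this, a file would be recompacted as soon as its oldest data expires, instead of a full TTL after the previous major compaction.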

St.Ack
