Hi, Stack:
Thanks for the explanation. I looked at the code, and it seems that the
old regions should get compacted and data older than the TTL will get
removed. I will run a test with a table with a 10-minute TTL, insert
data into several regions, wait for one day, and see whether the old
records are indeed removed.
Jimmy.
--------------------------------------------------
From: "Stack" <[email protected]>
Sent: Wednesday, September 15, 2010 9:53 PM
To: <[email protected]>
Subject: Re: hbase doesn't delete data older than TTL in old regions
On Wed, Sep 15, 2010 at 5:50 PM, Jinsong Hu <[email protected]>
wrote:
One thing I am not clear about with major compaction: for regions with
a single map file, will HBase actually load the file and remove the
records older than the TTL?
Major compactions will run even if there is only one file, unless this
file is already the product of a major compaction (files that have been
major compacted get a marker in their metadata, so the next time a major
compaction runs we'll skip the file) AND the time since the last major
compaction is < TTL (See
http://hbase.apache.org/docs/r0.89.20100726/xref/org/apache/hadoop/hbase/regionserver/Store.html#743).
The RegionServer runs a major compaction checking thread that wakes up
periodically.
So, it should be doing what you want (if a little crudely, given it's
waiting TTL before rechecking a file that is already major compacted).
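
To make the rule above concrete, here is a rough Java sketch of the
single-file decision. It is a simplification for illustration, not the
actual Store code: the class name, method name, and parameters are all
hypothetical, and it assumes the checking thread has already decided that
the store is due for a look.

```java
// Simplified sketch of the single-file major compaction decision.
// All names are hypothetical; this is NOT the real HBase Store code.
final class MajorCompactionCheck {

    /**
     * @param fileCount                 number of store files in the store
     * @param singleFileMajorCompacted  true if the only file carries the
     *                                  "product of a major compaction" marker
     * @param elapsedMs                 time since the last major compaction
     * @param ttlMs                     the column family TTL, in milliseconds
     */
    static boolean shouldMajorCompact(int fileCount,
                                      boolean singleFileMajorCompacted,
                                      long elapsedMs,
                                      long ttlMs) {
        if (fileCount == 0) {
            return false;           // nothing to compact
        }
        if (fileCount > 1) {
            return true;            // several files: always worth merging
        }
        // Single file: skip only if it was already major compacted AND
        // less than one TTL has passed since then.
        boolean skip = singleFileMajorCompacted && elapsedMs < ttlMs;
        return !skip;
    }
}
```

With a 10-minute TTL, a single already-compacted file is skipped at 5
minutes and picked up again at 15 minutes; a single file without the
marker is compacted regardless of age.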
We could improve this by looking at the oldest timestamp every time we
run the major compaction check.
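
One way to picture that improvement, as a sketch only: instead of waiting
a full TTL since the last major compaction, the check could ask whether
the oldest cell in the file has already expired. The class and method
below are hypothetical, and how the oldest timestamp would be obtained
(for example from store file metadata) is an assumption, not something
HBase is known to provide here.

```java
// Hypothetical sketch of a timestamp-based check; not existing HBase code.
final class OldestTimestampCheck {

    /**
     * Returns true when the oldest cell in a file is older than the TTL,
     * meaning a major compaction would actually remove some data.
     *
     * @param oldestCellTimestampMs timestamp of the oldest cell in the file
     *                              (assumed to be available to the caller)
     * @param nowMs                 current time in milliseconds
     * @param ttlMs                 the column family TTL, in milliseconds
     */
    static boolean hasExpiredData(long oldestCellTimestampMs,
                                  long nowMs,
                                  long ttlMs) {
        return nowMs - oldestCellTimestampMs > ttlMs;
    }
}
```

A check like this would trigger a compaction as soon as any record
outlives the TTL, rather than only after a full TTL has elapsed since the
previous major compaction.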
St.Ack