You could set hbase.hregion.majorcompaction to less than one day so you don't have to wait so long. Make sure DEBUG logging is enabled (it should be by default). With DEBUG on, you'll be able to see compactions running; the log will include the type of compaction run.
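
For illustration, a minimal sketch of what that override might look like in hbase-site.xml. The one-hour value is only an assumed test setting, not a recommendation; the property takes milliseconds. For the logging side, the stock conf/log4j.properties normally carries a line like log4j.logger.org.apache.hadoop.hbase=DEBUG, but check your own copy.

```xml
<!-- hbase-site.xml: illustrative test override, value is an assumption -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <!-- Interval between major compactions, in milliseconds.
       3600000 ms = 1 hour; one day would be 86400000. -->
  <value>3600000</value>
</property>
```

With a 10 minute TTL on the test table, an interval like this should let the expired cells be dropped within about an hour instead of a day.
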
Thanks for testing,
St.Ack

On Wed, Sep 15, 2010 at 10:43 PM, Jinsong Hu <[email protected]> wrote:
> Hi, Stack:
> Thanks for the explanation. I looked at the code and it seems that the old
> region should get compacted and data older than TTL will get removed. I will
> do a test with a table with 10 min TTL , and insert several regions and wait
> for 1 day, and see if old records will indeed get removed or not.
>
> Jimmy.
>
> --------------------------------------------------
> From: "Stack" <[email protected]>
> Sent: Wednesday, September 15, 2010 9:53 PM
> To: <[email protected]>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
>
>> On Wed, Sep 15, 2010 at 5:50 PM, Jinsong Hu <[email protected]>
>> wrote:
>>>
>>> One thing I am not clear about major compaction is that for the regions
>>> with a single map file,
>>> will hbase actually load it and remove the records older than TTL ?
>>
>> Major compactions will run even if only one file IFF this file is not
>> already the product of a major compaction (files that have been major
>> compacted get a marker in their metadata so next time a major
>> compaction runs we'll skip the file) AND the time since the last major
>> compaction is < TTL (See
>>
>> http://hbase.apache.org/docs/r0.89.20100726/xref/org/apache/hadoop/hbase/regionserver/Store.html#743).
>>
>> The RegionServer runs a Major Compaction checking thread... it runs on a
>> period.
>>
>> So, it should be doing what you want (if a little crudely given its
>> waiting TTL before rechecking if already major compacted.
>>
>> We could make improvement by looking at oldest timestamp every time we
>> run the major compaction check.
>>
>> St.Ack
>>
>
