On Wed, Sep 15, 2010 at 5:50 PM, Jinsong Hu <[email protected]> wrote:
> One thing I am not clear about major compaction is that for the regions with
> a single map file,
> will hbase actually load it and remove the records older than TTL ?

Major compactions will run even if there is only one file, unless that file is already the product of a major compaction AND the time since that major compaction is < TTL. (Files that have been major compacted get a marker in their metadata, so the next time a major compaction runs we'll skip the file until the TTL has elapsed.) See http://hbase.apache.org/docs/r0.89.20100726/xref/org/apache/hadoop/hbase/regionserver/Store.html#743.

The RegionServer runs a major compaction checking thread; it runs on a period. So it should be doing what you want, if a little crudely, given it's waiting TTL before rechecking a file that is already major compacted. We could improve this by looking at the oldest timestamp every time we run the major compaction check.

St.Ack
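
The single-file check described above can be sketched roughly as follows. This is a simplified illustration, not the actual code in Store.java: the class name, method name, parameters, and the FOREVER sentinel are all made up for this example, and it assumes the periodic checker has already decided that a major compaction is due for the store.

```java
// Hypothetical sketch of the single-file decision described in this thread.
// Names and signature are illustrative only; they are not the HBase API.
class MajorCompactionCheck {
    /** Assumed sentinel meaning "no TTL configured" for this sketch. */
    static final long FOREVER = Long.MAX_VALUE;

    /**
     * @param fileCount             number of store files in the store
     * @param alreadyMajorCompacted true if the single file carries the
     *                              major-compaction marker in its metadata
     * @param ageMillis             time since that file was written
     * @param ttlMillis             column family TTL, or FOREVER
     */
    static boolean shouldMajorCompact(int fileCount, boolean alreadyMajorCompacted,
                                      long ageMillis, long ttlMillis) {
        if (fileCount == 0) {
            return false;            // nothing to compact
        }
        if (fileCount > 1) {
            return true;             // several files: always worth merging
        }
        if (!alreadyMajorCompacted) {
            return true;             // single file, never major compacted
        }
        if (ttlMillis == FOREVER) {
            return false;            // nothing can expire, so skip the file
        }
        // Already major compacted: rewrite only once a full TTL has passed,
        // since only then can the file contain expired records.
        return ageMillis >= ttlMillis;
    }

    public static void main(String[] args) {
        long ttl = 60_000L; // one minute, chosen arbitrarily for the demo
        System.out.println(shouldMajorCompact(1, false, 0L, ttl));
        System.out.println(shouldMajorCompact(1, true, 30_000L, ttl));
        System.out.println(shouldMajorCompact(1, true, 90_000L, ttl));
    }
}
```

The crude part is the last branch: the file's age stands in for the age of its oldest record, so expired records can linger for up to one extra TTL before the file is rewritten.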
