Major compactions can still be useful to improve locality - could we add a condition to check for that too?

On Mon, Sep 9, 2013 at 10:41 PM, lars hofhansl <[email protected]> wrote:

> Interesting. I guess we could add a check to avoid major compactions if
> (1) no TTL is set or we can show that all data is newer, (2) there's
> only one file, and (3) there are no delete markers. All of these can be
> cheaply checked with some HFile metadata (we might have all data needed
> already).
>
> That would take care of both of your scenarios.
>
> -- Lars
> ________________________________
> From: Premal Shah <[email protected]>
> To: user <[email protected]>
> Sent: Monday, September 9, 2013 9:02 PM
> Subject: Tables gets Major Compacted even if they haven't changed
>
> Hi,
> We have a bunch of tables in our HBase cluster. We have a script which
> makes sure all of them get Major Compacted once every 2 days. There are 2
> things I'm observing:
>
> 1) Table X has not been updated in a month. We have not inserted, updated or
> deleted data. However, it still major compacts every 2 days. All the
> regions in this table have only 1 store file.
>
> 2) Table Y has a few regions where the rowkey is essentially a timestamp.
> So, we only write to 1 region at a time. Over time, the region splits, and
> then we write to one of the split regions. Now, whenever we major compact
> the table, all regions get major compacted. Only 1 region has more than 1
> store file; every other region has exactly one.
>
> Is there a way to avoid compaction of regions that have not changed?
>
> We are using HBase 0.94.11
>
> --
> Regards,
> Premal Shah.
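
For discussion, here is a rough sketch of how the three conditions above plus a locality condition might combine into one "can we skip this major compaction?" check. Everything in it is hypothetical: `MajorCompactionCheck`, `StoreFileSummary`, `NO_TTL` and the `minLocality` threshold are names made up for illustration, not existing HBase classes or settings, and it assumes the per-file values can be read from HFile metadata as suggested.

```java
import java.util.List;

class MajorCompactionCheck {
    /** Sentinel meaning "no TTL configured" (an assumption of this sketch). */
    static final long NO_TTL = Long.MAX_VALUE;

    /** Hypothetical per-file summary; the real values would come from HFile metadata. */
    static final class StoreFileSummary {
        final long oldestCellTimestampMs; // timestamp of the oldest cell in the file
        final long deleteMarkerCount;     // number of delete markers in the file
        final double localityIndex;       // fraction of the file's blocks that are local, 0.0 to 1.0

        StoreFileSummary(long oldestCellTimestampMs, long deleteMarkerCount, double localityIndex) {
            this.oldestCellTimestampMs = oldestCellTimestampMs;
            this.deleteMarkerCount = deleteMarkerCount;
            this.localityIndex = localityIndex;
        }
    }

    /** Returns true when a major compaction of this store would rewrite the data for no benefit. */
    static boolean canSkipMajorCompaction(List<StoreFileSummary> files,
                                          long ttlMs, long nowMs, double minLocality) {
        if (files.isEmpty()) {
            return true;  // nothing to compact
        }
        if (files.size() > 1) {
            return false; // condition (2): more than one file, so there is something to merge
        }
        StoreFileSummary file = files.get(0);
        // Condition (1): no TTL, or even the oldest cell is newer than the TTL cutoff.
        boolean nothingExpired = ttlMs == NO_TTL || file.oldestCellTimestampMs > nowMs - ttlMs;
        // Condition (3): no delete markers left to purge.
        boolean noDeleteMarkers = file.deleteMarkerCount == 0;
        // Extra condition: the file is already local enough that rewriting it gains nothing.
        boolean localEnough = file.localityIndex >= minLocality;
        return nothingExpired && noDeleteMarkers && localEnough;
    }
}
```

With a check like this, the idle single-file regions in both scenarios would be skipped, while a region whose one file has poor locality would still be compacted.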
