I am more concerned with the CompactionPolicy options that let an application influence how compaction proceeds. It looks like there is a newer API in the 0.97 version, ExploringCompactionPolicy <http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/compactions/ExploringCompactionPolicy.html>, which lets the application decide when a major compaction should happen.
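To check my understanding of the promotion rule JM describes in the quoted message (a minor compaction becomes a major one when the policy ends up selecting all the store files), here is a toy sketch in Python. It is not HBase code: the function names are hypothetical, and the ratio of 1.2 and the minimum of 3 files are illustrative assumptions in a simplified ratio-based selection.

```python
# Toy model only, NOT the HBase implementation. Names and defaults are
# assumptions made for illustration.

def select_files(sizes, ratio=1.2, min_files=3):
    """Pick store files for a compaction.

    `sizes` lists store file sizes ordered oldest to newest. Starting from
    the oldest file, skip any file that is larger than `ratio` times the
    combined size of all newer files. Return the indices of the files that
    remain, or an empty list if fewer than `min_files` are left.
    """
    start = 0
    while (len(sizes) - start >= min_files
           and sizes[start] > ratio * sum(sizes[start + 1:])):
        start += 1
    if len(sizes) - start < min_files:
        return []
    return list(range(start, len(sizes)))


def is_promoted_to_major(selected, sizes):
    """A selection covering every store file is treated as a major compaction."""
    return len(sizes) > 0 and len(selected) == len(sizes)


if __name__ == "__main__":
    for sizes in ([100, 10, 10, 10], [30, 10, 10, 10]):
        chosen = select_files(sizes)
        print(sizes, chosen, is_promoted_to_major(chosen, sizes))
```

In this model the application has no direct say in the promotion: whether it happens depends only on the file sizes at the moment the selection runs, which matches JM's point that it can happen at any time.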
For stripe compaction, it is very interesting, will look into it. Thanks.
Yun

On Sat, Jun 22, 2013 at 9:24 AM, Jean-Marc Spaggiari <[email protected]> wrote:

> Hi Yun,
>
> There is more differences.
>
> The minor compactions are not remove the delete flags and the deleted
> cells. It only merge the small files into a bigger one. Only the major
> compaction (in 0.94) will deal with the delete cells. There is also
> some more compaction mechanism coming in trunk with nice features.
>
> Look at: https://issues.apache.org/jira/browse/HBASE-7902
> https://issues.apache.org/jira/browse/HBASE-7680
> https://issues.apache.org/jira/browse/HBASE-7680
>
> Minor compactions are promoted to major compactions when the
> compaction policy decide to compact all the files. If all the files
> need to be merged, then we can run a major compaction which will do
> the same thing as the minor one, but with the bonus of deleting the
> required marked cells.
>
> JM
>
> 2013/6/22 yun peng <[email protected]>:
> > Thanks, JM
> > It seems like the sole difference btwn major and minor compaction is the
> > number of files (to be all or just a subset of storefiles). It mentioned
> > very briefly in http://hbase.apache.org/book
> > <http://hbase.apache.org/book/regions.arch.html> that
> > "Sometimes a minor compaction will ... promote itself to being a major
> > compaction". What does "sometime" exactly mean here? or any policy in HBase
> > that allow application to specify when to promote a minor compaction to be
> > a major (like user or some monitoring service can specify now is offpeak
> > time?)
> > Yun
> >
> > On Sat, Jun 22, 2013 at 8:51 AM, Jean-Marc Spaggiari
> > <[email protected]> wrote:
> >
> >> Hi Yun,
> >>
> >> Few links:
> >> - http://blog.cloudera.com/blog/2012/06/hbase-io-hfile-input-output/
> >> => There is a small paragraph about compactions which explain when
> >> they are triggered.
> >> - http://hbase.apache.org/book/regions.arch.html 9.7.6.5
> >>
> >> You are almost right. Only thing is that HBase doesn't know when is
> >> your offpeak, so a major compaction can be triggered anytime if the
> >> minor is promoted to be a major one.
> >>
> >> JM
> >>
> >> 2013/6/22 yun peng <[email protected]>:
> >> > Hi, All
> >> >
> >> > I am asking the different practices of major and minor compaction... My
> >> > current understanding is that minor compaction, triggered automatically,
> >> > usually run along with online query serving (but in background), so that it
> >> > is important to make it as lightweight as possible... to minimise downtime
> >> > (pause time) of online query.
> >> >
> >> > In contrast, the major compaction is invoked in offpeak time and usually
> >> > can be assume to have resource exclusively. It may have a different
> >> > performance optimization goal...
> >> >
> >> > Correct me if wrong, but let me know if HBase does design different
> >> > compaction mechanism this way..?
> >> >
> >> > Regards,
> >> > Yun
