Hi Yun,

There are more differences.
Minor compactions do not remove the delete markers or the deleted cells. They only merge small files into a bigger one. Only a major compaction (in 0.94) deals with the deleted cells. There are also more compaction mechanisms coming in trunk, with nice features. Look at:
https://issues.apache.org/jira/browse/HBASE-7902
https://issues.apache.org/jira/browse/HBASE-7680

A minor compaction is promoted to a major compaction when the compaction policy decides to compact all the files. If all the files need to be merged anyway, we can run a major compaction, which does the same thing as the minor one, with the bonus of deleting the cells marked for deletion.

JM

2013/6/22 yun peng <pengyunm...@gmail.com>:
> Thanks, JM
> It seems like the sole difference between major and minor compaction is the
> number of files (all of them, or just a subset of the storefiles). It is
> mentioned very briefly in http://hbase.apache.org/book/regions.arch.html that
> "Sometimes a minor compaction will ... promote itself to being a major
> compaction". What exactly does "sometimes" mean here? Or is there any policy
> in HBase that allows an application to specify when to promote a minor
> compaction to a major one (e.g. a user or some monitoring service can specify
> that now is offpeak time)?
> Yun
>
> On Sat, Jun 22, 2013 at 8:51 AM, Jean-Marc Spaggiari <
> jean-m...@spaggiari.org> wrote:
>
>> Hi Yun,
>>
>> Few links:
>> - http://blog.cloudera.com/blog/2012/06/hbase-io-hfile-input-output/
>> => There is a small paragraph about compactions which explains when
>> they are triggered.
>> - http://hbase.apache.org/book/regions.arch.html 9.7.6.5
>>
>> You are almost right. The only thing is that HBase doesn't know when
>> your offpeak is, so a major compaction can be triggered anytime if the
>> minor is promoted to be a major one.
>>
>> JM
>>
>> 2013/6/22 yun peng <pengyunm...@gmail.com>:
>> > Hi, All
>> >
>> > I am asking about the different practices of major and minor compaction...
>> > My current understanding is that minor compaction, triggered automatically,
>> > usually runs alongside online query serving (but in the background), so
>> > it is important to make it as lightweight as possible... to minimise
>> > downtime (pause time) of online queries.
>> >
>> > In contrast, major compaction is invoked in offpeak time and can usually
>> > be assumed to have resources exclusively. It may have a different
>> > performance optimization goal...
>> >
>> > Correct me if I am wrong, but let me know if HBase does design different
>> > compaction mechanisms this way..?
>> >
>> > Regards,
>> > Yun
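
For illustration only, the promotion rule from the reply at the top of this thread (a compaction that selects every store file becomes a major one, and only a major one drops delete markers and the cells they mask) can be modelled in a few lines of Python. This is a toy sketch, not HBase code: StoreFile, compact and drop_deleted are hypothetical names, a cell whose value is None stands in for a delete marker, and the sketch assumes a marker masks every cell of the same row with a timestamp at or below its own.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# Toy cell: (row, timestamp, value). value=None is a stand-in for a delete marker.
Cell = Tuple[str, int, Optional[str]]


@dataclass
class StoreFile:
    # Hypothetical stand-in for an on-disk store file.
    name: str
    cells: List[Cell]


def drop_deleted(cells: List[Cell]) -> List[Cell]:
    """Remove delete markers and the cells they mask (assumed rule: ts <= marker ts)."""
    deleted_at: Dict[str, int] = {}
    for row, ts, value in cells:
        if value is None:
            deleted_at[row] = max(ts, deleted_at.get(row, ts))
    return [
        (row, ts, value)
        for row, ts, value in cells
        if value is not None and (row not in deleted_at or ts > deleted_at[row])
    ]


def compact(store_files: List[StoreFile],
            selected_names: Set[str]) -> Tuple[List[StoreFile], bool]:
    """Merge the selected files; promote to major if nothing is left unselected."""
    selected = [f for f in store_files if f.name in selected_names]
    remaining = [f for f in store_files if f.name not in selected_names]
    promoted = not remaining  # every file was selected
    merged = sorted((c for f in selected for c in f.cells),
                    key=lambda c: (c[0], c[1]))
    if promoted:
        merged = drop_deleted(merged)  # only safe when all files take part
    return remaining + [StoreFile("merged", merged)], promoted


files = [
    StoreFile("f1", [("row1", 1, "a")]),
    StoreFile("f2", [("row1", 2, None)]),  # delete marker for row1
    StoreFile("f3", [("row2", 3, "b")]),
]

# Subset selected: stays minor, the delete marker is carried into the merged file.
minor_result, minor_promoted = compact(files, {"f2", "f3"})

# All files selected: promoted to major, marker and masked cell are dropped.
major_result, major_promoted = compact(files, {"f1", "f2", "f3"})
```

In the minor case the marker has to survive because f1, which was not part of the selection, still holds a cell that the marker must keep masking. Once every file takes part in the merge there is nothing left outside it, so the marker and the masked cell can both be dropped.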