[ https://issues.apache.org/jira/browse/HBASE-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916006#action_12916006 ]
stack commented on HBASE-3048:
------------------------------

This is fine by me. The one objection I was going to raise was the delete markers story but you got that in your footnote.

> unify code for major/minor compactions
> --------------------------------------
>
>                 Key: HBASE-3048
>                 URL: https://issues.apache.org/jira/browse/HBASE-3048
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>
> Today minor compactions do not process deletes, purge old versions, etc. Only
> major compactions do. The rationale was probably to save CPU (?). We should
> evaluate if major compaction logic indeed runs significantly slower.
> Unifying minor compactions to do the same thing as major compactions has
> other advantages:
> * If the same data is overwritten several times and we are not processing
> overwrites, it makes each subsequent minor compaction more expensive as the
> total amount of data grows.
> * We'll have fewer bugs if the logic is as symmetric as possible. Any bugs in
> TTL enforcement, version enforcement, etc. could cause behavior to be
> different after a major compaction. Keeping the same logic means these bugs
> will get caught earlier.
>
> Note: There will still need to be one difference in the two schemes, and that
> has to do with delete markers. Any compaction which doesn't compact all files
> will still need to leave delete markers.
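
The unified scheme and its one exception for delete markers can be illustrated with a minimal Java sketch. This is not HBase code: the class, method, and field names are hypothetical, and the model is simplified. It assumes cells arrive sorted by key, then by timestamp descending, with a delete marker ahead of a put at the same timestamp; that a delete marker masks every put for the same key at or below its timestamp; and that timestamps are non-negative. TTL enforcement is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the proposal; none of these names are HBase APIs.
final class CompactionSketch {

    /** One cell version: either a put or a delete marker. */
    static final class Cell {
        final String key;       // row and column collapsed into one string for brevity
        final long timestamp;   // assumed non-negative
        final boolean isDelete; // true for a delete marker

        Cell(String key, long timestamp, boolean isDelete) {
            this.key = key;
            this.timestamp = timestamp;
            this.isDelete = isDelete;
        }

        @Override
        public String toString() {
            return key + "@" + timestamp + (isDelete ? "(delete)" : "");
        }
    }

    /**
     * Runs the same logic for every compaction: apply deletes and purge versions
     * beyond maxVersions. The only switch is allFiles, which says whether every
     * store file took part in this compaction.
     */
    static List<Cell> compact(List<Cell> sorted, int maxVersions, boolean allFiles) {
        List<Cell> out = new ArrayList<>();
        String currentKey = null;
        long deleteTs = -1; // newest delete marker seen for currentKey, -1 if none
        int kept = 0;       // put versions kept so far for currentKey

        for (Cell c : sorted) {
            if (!c.key.equals(currentKey)) {
                currentKey = c.key;
                deleteTs = -1;
                kept = 0;
            }
            if (c.isDelete) {
                deleteTs = Math.max(deleteTs, c.timestamp);
                if (!allFiles) {
                    // The marker may still mask cells in files that were not compacted.
                    out.add(c);
                }
                continue;
            }
            if (c.timestamp <= deleteTs) {
                continue; // masked by a delete marker
            }
            if (kept < maxVersions) {
                out.add(c);
                kept++;
            } // otherwise the version is too old and is purged
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("a", 5, false), new Cell("a", 4, true),
            new Cell("a", 3, false), new Cell("b", 9, false));
        System.out.println("partial: " + compact(cells, 2, false));
        System.out.println("all files: " + compact(cells, 2, true));
    }
}
```

In this sketch both calls drop the masked put `a@3`; only the compaction over a subset of files keeps the delete marker `a@4`, which matches the note in the issue description.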