unify code for major/minor compactions
--------------------------------------
Key: HBASE-3048
URL: https://issues.apache.org/jira/browse/HBASE-3048
Project: HBase
Issue Type: Improvement
Reporter: Kannan Muthukkaruppan
Today, minor compactions do not process deletes, purge old versions, etc. Only
major compactions do. The rationale was probably to save CPU (?). We should
evaluate whether the major compaction logic really runs significantly slower.
Unifying minor compactions to do the same thing as major compactions has other
advantages:
* If the same data is overwritten several times and we are not processing
overwrites, each subsequent minor compaction becomes more expensive as the
total amount of data it has to rewrite keeps growing.
* We'll have fewer bugs if the logic is as symmetric as possible. Any bugs in
TTL enforcement, version enforcement, etc. could cause behavior to be different
after a major compaction. Keeping the same logic means these bugs will get
caught earlier.
Note: There will still need to be one difference between the two schemes, and
that has to do with delete markers. Any compaction which doesn't compact all
files will still need to leave delete markers in place, since they may mask
data in files that were not part of the compaction.
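
As a rough illustration of the proposal, here is a minimal sketch of a single
compaction routine shared by both cases, where the only difference is whether
delete markers are retained. This is not HBase's actual code: the Cell class,
the compact() method, and the compactsAllFiles flag are hypothetical names made
up for this example. It also assumes a delete marker masks every version of the
same key at or below its timestamp, and it leaves out TTL handling.

```java
import java.util.ArrayList;
import java.util.List;

final class UnifiedCompactionSketch {

    /** Hypothetical, simplified cell. This is not HBase's KeyValue. */
    static final class Cell {
        final String key;       // row and column flattened into one string
        final long timestamp;
        final boolean isDelete; // true for a delete marker
        final String value;     // null for delete markers

        Cell(String key, long timestamp, boolean isDelete, String value) {
            this.key = key;
            this.timestamp = timestamp;
            this.isDelete = isDelete;
            this.value = value;
        }
    }

    /**
     * One code path for both kinds of compaction. Deletes and version limits
     * are always applied; delete markers are dropped only when every file is
     * part of the compaction (assumption: otherwise they may still mask data
     * in files that were left out).
     */
    static List<Cell> compact(List<Cell> input, int maxVersions, boolean compactsAllFiles) {
        List<Cell> sorted = new ArrayList<>(input);
        sorted.sort((a, b) -> {
            int byKey = a.key.compareTo(b.key);
            if (byKey != 0) return byKey;
            int byTs = Long.compare(b.timestamp, a.timestamp); // newest first
            if (byTs != 0) return byTs;
            return Boolean.compare(b.isDelete, a.isDelete);    // marker before put
        });

        List<Cell> out = new ArrayList<>();
        String currentKey = null;
        Long maskedUpTo = null; // newest delete timestamp seen for the current key
        int versionsKept = 0;

        for (Cell c : sorted) {
            if (!c.key.equals(currentKey)) {
                currentKey = c.key;
                maskedUpTo = null;
                versionsKept = 0;
            }
            if (c.isDelete) {
                if (maskedUpTo == null || c.timestamp > maskedUpTo) {
                    maskedUpTo = c.timestamp;
                }
                if (!compactsAllFiles) {
                    out.add(c); // the one remaining difference: keep the marker
                }
                continue;
            }
            if (maskedUpTo != null && c.timestamp <= maskedUpTo) {
                continue; // masked by a delete marker
            }
            if (versionsKept >= maxVersions) {
                continue; // older than the allowed number of versions
            }
            out.add(c);
            versionsKept++;
        }
        return out;
    }
}
```

With this shape, a bug in version or delete handling would show up in every
compaction rather than only after a major one, which is the symmetry argument
made above.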