Duo Zhang created HBASE-16223:
---------------------------------
Summary: Drop duplicated delete markers in minor compaction
Key: HBASE-16223
URL: https://issues.apache.org/jira/browse/HBASE-16223
Project: HBase
Issue Type: Improvement
Reporter: Duo Zhang
We have run into this recently. One of our customers deletes the same row
many times (the record is about 100,000 times), which causes scans to time
out. For now we trigger a major compaction every day to drop the duplicated
delete markers, but this is not a good approach, since the cost of major
compaction grows as the data gets larger.
In fact, I think only the newest delete marker is useful (if maxversions =
1), so a minor compaction could retain just that marker and drop the older
duplicates.
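
A minimal sketch of the idea, not the actual HBase compaction code. It
assumes maxversions = 1, that all markers are row-level deletes of the same
kind, and that cells arrive sorted by row and then by timestamp, newest
first. SimpleCell and DeleteMarkerDedup are hypothetical names used only for
illustration; they are not part of the HBase API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for an HBase cell (not the real Cell API).
final class SimpleCell {
    final String row;
    final long timestamp;
    final boolean deleteMarker;

    SimpleCell(String row, long timestamp, boolean deleteMarker) {
        this.row = row;
        this.timestamp = timestamp;
        this.deleteMarker = deleteMarker;
    }
}

final class DeleteMarkerDedup {
    // Input must be sorted by row, then by timestamp descending (newest first).
    // Keeps the first (newest) delete marker seen for each row and drops the
    // older ones. Non-delete cells are passed through unchanged.
    static List<SimpleCell> dropDuplicateDeleteMarkers(List<SimpleCell> sorted) {
        List<SimpleCell> out = new ArrayList<>();
        String currentRow = null;
        boolean seenDelete = false;
        for (SimpleCell c : sorted) {
            if (!c.row.equals(currentRow)) {
                // New row: reset the per-row state.
                currentRow = c.row;
                seenDelete = false;
            }
            if (c.deleteMarker) {
                if (seenDelete) {
                    continue; // older duplicate marker, already covered by the newest one
                }
                seenDelete = true;
            }
            out.add(c);
        }
        return out;
    }
}
```

With this sketch, a row carrying 100,000 delete markers would leave the
compaction with a single marker, so a later scan no longer has to skip over
the duplicates. A real implementation would also have to account for the
different delete types and for settings such as keeping deleted cells.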