[
https://issues.apache.org/jira/browse/CASSANDRA-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sylvain Lebresne resolved CASSANDRA-5183.
-----------------------------------------
Resolution: Duplicate
Fix Version/s: (was: 1.2.2)
Seems like 4 months is the limit of my memory: this is the same as
CASSANDRA-4671.
> Improve cases where we purge tombstone on (minor) compaction
> ------------------------------------------------------------
>
> Key: CASSANDRA-5183
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5183
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Priority: Minor
>
> Currently, to be able to purge a tombstone, we check that the row it is part
> of is not present in a non-compacted sstable, as we should not remove a
> tombstone that may delete other columns in the non-compacted sstables.
> The (known) problem is, if you regularly update a row on which you've made
> deletes, tombstones may theoretically be kept forever unless you run a major
> compaction (which is bad and not even a possibility with leveled compaction).
> In practice, with wide rows and more precisely time-series type of load, it
> is not unlikely that tombstones might be kept, if not forever, at least much
> longer than gcgrace.
> One way to improve on that would be to start storing the minTimestamp of
> sstables (like we keep the maxTimestamp). During compaction, on top of checking
> bloom filters, we would also check if the max timestamp of what we're about
> to purge is smaller than the min timestamp of the non-compacted sstables. If it
> is, then whatever tombstone we are looking at cannot shadow something in the
> non-compacted sstables and we can purge it (that is, even if the row in
> question may have columns in those non-compacted sstables).
> Note that while this isn't perfect in theory:
> # this is cheap to check. We may even compute the min timestamp of the
> non-compacted sstables once at the beginning of the compaction and check that
> before looking at the BF, which may save a few intervalTree searches (if we do
> end up doing the intervalTree search, however, we might still want to recompute
> the min timestamp of the returned sstables, as this may be bigger than the min
> timestamp of all the non-compacted sstables).
> # both size-tiered and leveled compaction naturally tend to compact sstables
> having data of roughly the same age, so this should work reasonably well.
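
The check described above could be sketched roughly as follows. This is only an illustration of the idea, not Cassandra's code: the class TombstonePurgeSketch, the SSTableInfo type, and the methods minTimestampOf and canPurge are all hypothetical names, the bloom filter is modelled as a plain predicate, and the sketch assumes gcgrace has already elapsed for the data being purged.

```java
import java.util.List;
import java.util.function.Predicate;

public class TombstonePurgeSketch {

    // Hypothetical stand-in for per-sstable metadata; not a Cassandra class.
    static final class SSTableInfo {
        final long minTimestamp;               // smallest timestamp stored in the sstable
        final Predicate<String> mayContainKey; // models a bloom filter lookup

        SSTableInfo(long minTimestamp, Predicate<String> mayContainKey) {
            this.minTimestamp = minTimestamp;
            this.mayContainKey = mayContainKey;
        }
    }

    // Computed once at the beginning of the compaction over all non-compacted sstables.
    static long minTimestampOf(List<SSTableInfo> sstables) {
        long min = Long.MAX_VALUE;
        for (SSTableInfo s : sstables) {
            min = Math.min(min, s.minTimestamp);
        }
        return min;
    }

    // Decides whether data for `key` whose largest timestamp is
    // `maxPurgeableTimestamp` can be dropped (gcgrace assumed to have elapsed).
    static boolean canPurge(String key,
                            long maxPurgeableTimestamp,
                            List<SSTableInfo> nonCompacted,
                            long globalMinTimestamp) {
        // Cheap check first: older than everything outside the compaction,
        // so the tombstone cannot shadow anything there.
        if (maxPurgeableTimestamp < globalMinTimestamp) {
            return true;
        }
        // Otherwise restrict to sstables that may contain the key and
        // recompute the min timestamp over just those.
        long overlappingMin = Long.MAX_VALUE;
        for (SSTableInfo s : nonCompacted) {
            if (s.mayContainKey.test(key)) {
                overlappingMin = Math.min(overlappingMin, s.minTimestamp);
            }
        }
        // No sstable may contain the key: overlappingMin stays at MAX_VALUE,
        // which matches the existing "row not present elsewhere" rule.
        return maxPurgeableTimestamp < overlappingMin;
    }
}
```

In this sketch the global minimum acts as a fast path, and the per-key minimum is only recomputed when the fast path fails, since it can be larger than the global one.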