[ https://issues.apache.org/jira/browse/CASSANDRA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908792#action_12908792 ]
Jonathan Ellis commented on CASSANDRA-1074:
-------------------------------------------
committed w/ minor changes.
happy to merge backport to 0.6 as well.
Thanks Sylvain!
> check bloom filters to make minor compaction able to delete (some) tombstones
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-1074
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1074
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Robert Coli
> Assignee: Sylvain Lebresne
> Fix For: 0.7 beta 2
>
> Attachments:
> 0001-Purge-tombstone-on-minor-compaction-after-gc_grace_p.patch
>
>
> Given a tombstoned key which is older than GCGraceSeconds, the current (0.6.1)
> compaction implementation still requires a major compaction for the key to
> actually be deleted. The major compaction is required because we must know
> whether there is a version of the key inside any of the SSTables associated
> with the columnfamily, including ones not involved in the minor compaction.
> However, as we have bloom filters for each one of these SSTables, a minor
> compaction can relatively inexpensively check for the existence of this key in
> the SSTable files not involved in the current minor compaction, and thereby
> delete the key, assuming all bloom filters return negative. If any filter
> returns positive, a major compaction would of course still be required.
> For use cases like CASSANDRA-1041 where users are strongly motivated to not
> do a major compaction, this seems to answer the jbellis objection:
> "You don't want to skip large files in major compactions, since the
> definition of major is 'compact everything so it is safe to remove
> tombstones.'"
> The above-described improvement appears to provide "safe to remove (some)
> tombstones" without requiring "compact everything", and so may be a useful
> optimization.
> =Rob
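For readers following the thread, the check proposed in the description can be
sketched roughly as below. This is only an illustration in Python under stated
assumptions, not the committed patch: the BloomFilter and SSTable classes and
the can_purge_tombstone function are hypothetical names made up for this
example, and the filter is a toy one, not Cassandra's implementation.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    """Hypothetical stand-in for an SSTable: a name plus a filter over its keys."""

    def __init__(self, name, keys):
        self.name = name
        self.bloom = BloomFilter()
        for key in keys:
            self.bloom.add(key)


def can_purge_tombstone(key, deleted_at, now, gc_grace_seconds, uninvolved_sstables):
    """Return True if a minor compaction may drop the tombstone for `key`.

    `uninvolved_sstables` are the SSTables of the column family that are NOT
    part of the current minor compaction.
    """
    # The tombstone must be older than GCGraceSeconds.
    if now - deleted_at <= gc_grace_seconds:
        return False
    # Any positive answer means the key may exist elsewhere, so keep the tombstone.
    return not any(t.bloom.might_contain(key) for t in uninvolved_sstables)
```

Because a Bloom filter never returns a false negative, a unanimous "no" from
the uninvolved SSTables is safe to act on. A false positive only means the
tombstone is kept until a later compaction, which matches the "(some)" in the
issue title.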
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.