[ 
https://issues.apache.org/jira/browse/CASSANDRA-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586793#comment-15586793
 ] 

Sharvanath Pathak edited comment on CASSANDRA-12778 at 10/18/16 9:59 PM:
-------------------------------------------------------------------------

There are multiple problems right now:
i) Tombstones can keep circulating in the cluster forever. For example, even 
after a node purges a tombstone, it can receive the same tombstone again via 
read-repair and rewrite this old tombstone as unrepaired. We are seeing this 
happen even with read-repair disabled and are debugging what else could cause 
it.

ii) Flush almost always triggers a compaction for the Leveled strategy. As a 
result, the data in these old SSTables never gets marked repaired, because the 
old SSTable which has been repaired is already gone by the time repair comes to 
mark it repaired. This looks like a race, and compaction should be disallowed 
(at some fine granularity, e.g. per keyspace) while repair is running.

iii) The check for whether there is any overlapping data in other SSTables is 
very coarse in granularity: it only looks at the minimumTimestamp of the 
non-compacting SSTables.
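
To illustrate point iii), here is a minimal sketch of a purge check that relies 
only on the minimum timestamp of the non-compacting SSTables. It is a toy model 
of the behaviour described above, not Cassandra's actual code: the SSTable 
class and the functions max_purgeable_timestamp and can_purge are hypothetical 
names chosen for this example, and any per-key overlap check is deliberately 
left out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an SSTable's metadata (not a Cassandra class)."""
    name: str
    min_timestamp: int  # oldest write timestamp stored in this sstable


def max_purgeable_timestamp(non_compacting: Iterable[SSTable]) -> float:
    """Coarse bound: the smallest min_timestamp over all sstables that are
    not part of the current compaction. With no such sstables, everything
    is purgeable."""
    return min((s.min_timestamp for s in non_compacting), default=float("inf"))


def can_purge(tombstone_ts: int, non_compacting: Iterable[SSTable]) -> bool:
    """A tombstone is purgeable only if it is older than every piece of data
    that might live in a non-compacting sstable."""
    return tombstone_ts < max_purgeable_timestamp(non_compacting)


# One freshly flushed sstable that still carries a single old cell
# (timestamp 100) is enough to block every newer tombstone.
flushed = SSTable(name="flushed-l0", min_timestamp=100)
old_tombstone_purgeable = can_purge(50, [flushed])
new_tombstone_purgeable = can_purge(500, [flushed])
```

In this model, one old cell in an otherwise unrelated SSTable blocks every 
tombstone newer than it, which would match the symptom that only tombstones 
older than that timestamp get deleted.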




> Tombstones not being deleted when only_purge_repaired_tombstones is enabled
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12778
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12778
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Arvind Nithrakashyap
>            Assignee: Marcus Eriksson
>            Priority: Critical
>
> When using only_purge_repaired_tombstones for compaction, we noticed that 
> tombstones are no longer being deleted.
> {noformat}compaction = {'class': 
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 
> 'only_purge_repaired_tombstones': 'true'}{noformat}
> The root cause seems to be that repair itself issues a flush, which in turn 
> creates a new sstable (which is not in the repair set). It looks like we do 
> have some old data in this sstable; because of this, only tombstones older 
> than the timestamp of that old data are getting deleted, even though many 
> more keys have been repaired. 
> Fundamentally, it looks like flush and repair can race with each other: with 
> leveled compaction, the flush creates a new sstable at level 0 and removes 
> the older sstable (the one that is picked for repair). Since repair itself 
> seems to issue multiple flushes, the level 0 sstable never gets repaired and 
> hence tombstones never get deleted. 
> We have already included the fix for CASSANDRA-12703 while testing. 
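
The race described above can be sketched with a toy model. This is only an 
illustration under simplifying assumptions, not Cassandra code: the live set 
of sstables is a plain dict mapping a name to its repaired flag, and 
run_repair and flush_then_compact are hypothetical names invented for the 
example.

```python
from typing import Callable, Dict, List


def run_repair(live: Dict[str, bool],
               during_repair: Callable[[Dict[str, bool]], None]) -> List[str]:
    """Toy repair: remember which sstables exist at the start, let other
    activity happen, then mark as repaired only those that still exist."""
    selected = sorted(live)   # sstables picked for this repair session
    during_repair(live)       # flush/compaction racing with the repair
    marked = []
    for name in selected:
        if name in live:      # sstables replaced in the meantime are skipped
            live[name] = True
            marked.append(name)
    return marked


def flush_then_compact(live: Dict[str, bool]) -> None:
    """Toy flush that triggers a compaction: the sstable picked for repair
    is replaced by a new, unrepaired one."""
    live["l0-2"] = False      # flush writes a new level 0 sstable
    del live["l0-1"]          # compaction removes the older sstable ...
    del live["l0-2"]
    live["l0-3"] = False      # ... and writes its output as unrepaired


live = {"l0-1": False}
marked = run_repair(live, flush_then_compact)
```

In this model nothing ends up marked repaired, and the only live sstable is 
the new unrepaired one, so with only_purge_repaired_tombstones enabled its 
tombstones would never become eligible for purging.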



