[ 
https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159467#comment-14159467
 ] 

Oleg Anastasyev commented on CASSANDRA-7872:
--------------------------------------------

I guess you are talking not about STW pauses but about the fact that the 
moment of cleanup cannot be predicted and is not guaranteed to happen. Yes, if 
you build your entire critical resource cleanup system on ref queues, that is 
a problem. 

But that is not the case in this particular ticket. Most resources are still 
deleted normally when the sstable ref count reaches 0, i.e. as predictably as 
before. Cleanup via the ref queue is used only for those sstables whose refs 
were miscounted and which would otherwise never be deleted until restart. 
Having them deleted on the next CMS GC (i.e. at some not exactly predictable 
point in the near future) is IMO better than never having them deleted. 
Some of our C* clusters have been running for several months and even years 
with no reboot, so not cleaning up resources until restart is an operational 
pain.
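
To make the two paths concrete, here is a minimal sketch of the idea, not the 
code from the attached patch. All names in it (RefQueueFallback, Files, 
Tracker, track, releasedNormally, drain) are hypothetical. The normal path 
deletes as soon as the ref count reaches 0; the ref queue only catches readers 
that the GC found unreachable while their files were still on disk, and it 
logs each such miscount.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

public class RefQueueFallback {
    /** Hypothetical stand-in for the on-disk files of one sstable.
     *  It must not reference the reader, or the reader could never be collected. */
    static final class Files {
        final String name;
        final AtomicBoolean deleted = new AtomicBoolean(false);
        Files(String name) { this.name = name; }
        /** Returns true only for the call that actually performs the deletion. */
        boolean delete() { return deleted.compareAndSet(false, true); }
    }

    /** Phantom reference that remembers which files belong to the reader. */
    static final class Tracker extends PhantomReference<Object> {
        final Files files;
        Tracker(Object reader, Files files, ReferenceQueue<Object> queue) {
            super(reader, queue);
            this.files = files;
        }
    }

    final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Trackers must stay strongly reachable, or they would never be enqueued.
    final Set<Tracker> live = ConcurrentHashMap.newKeySet();

    Tracker track(Object reader, Files files) {
        Tracker t = new Tracker(reader, files, queue);
        live.add(t);
        return t;
    }

    /** Normal path: the ref count reached 0, so delete now and stop tracking. */
    void releasedNormally(Tracker t) {
        live.remove(t);
        t.files.delete();
    }

    /** Fallback path: drain readers the GC found unreachable and report leaks. */
    List<String> drain() {
        List<String> leaked = new ArrayList<>();
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            Tracker t = (Tracker) r;
            live.remove(t);
            if (t.files.delete()) {
                // Files were still present, so the ref count never reached 0.
                leaked.add(t.files.name);
                System.err.println("Reference miscount, late cleanup of " + t.files.name);
            }
        }
        return leaked;
    }
}
```

In a running node drain() would be called periodically or from a dedicated 
thread, and when a Tracker gets enqueued depends entirely on the collector. 
For experiments, calling enqueue() on a Tracker by hand simulates the GC 
noticing an unreachable reader.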

As far as I can see, CASSANDRA-7705 is planned for 3.0, which is far from 
being ready. So I suggest it is better to have some resource cleanup code now, 
in 2.0 and 2.1. It could be removed later, in the 3.0 release. 

Another benefit of having #3 is that it could help catch reference miscount 
bugs that leave resources not cleaned up, b/c it writes to the log when it 
detects a miscount. Without #3, miscounts pass unnoticed. This could be used 
for debugging CASSANDRA-7705 as well.

> ensure compacted obsolete sstables are not open on node restart and nodetool 
> refresh, even on sstable reference miscounting or deletion tasks are failed.
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7872
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Oleg Anastasyev
>            Assignee: Oleg Anastasyev
>             Fix For: 2.0.11
>
>         Attachments: EnsureNoObsoleteSSTables-7872-v2.0.txt
>
>
> Since CASSANDRA-4436, compacted sstables are no longer marked with a 
> COMPACTED_MARKER file. Instead, after they are compacted, DataTracker calls 
> SSTableReader.markObsolete(), but the actual deletion happens later, in 
> SSTableReader.releaseReference().
> This reference counting is very fragile: it is very easy to introduce a 
> hard-to-catch and rare bug so that the reference count never reaches 0 (like 
> CASSANDRA-6503, for example).
> This means that, very rarely, obsolete sstable files are not removed from 
> disk (but are no longer used by Cassandra to read data).
> If more than gc grace time has passed while such an sstable file remained on 
> disk, and the operator either issues nodetool refresh or just reboots the 
> node, these obsolete files are discovered and opened for read by the node. So 
> deleted data is resurrected and quickly spread by RR to the whole cluster.
> Because the consequences are very serious (even a single obsolete sstable 
> file that was not removed could render your data useless), this patch makes 
> sure no obsolete sstable file can be opened for read by:
> 1. Removing sstables on CFS init by analyzing sstable generations (an sstable 
> is removed if there is another sstable listing it as an ancestor).
> 2. Reimplementing the COMPACTED_MARKER file for sstables. This marker is 
> created as soon as markObsolete is called. This is necessary b/c generation 
> info can be lost (when sstables compact to none).
> 3. To remove sstables sooner than at restart, reimplementing the good old GC 
> phantom reference queue as well. 
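
Rules 1 and 2 above can be sketched roughly as follows. This is an 
illustration under simplifying assumptions, not the code from the attached 
patch: ObsoleteSSTableFilter, Candidate and obsoleteGenerations are 
hypothetical names, each sstable is reduced to its generation number, its 
ancestor generations and a marker flag, and details such as temporary files 
from unfinished compactions are ignored.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ObsoleteSSTableFilter {
    /** Hypothetical, simplified view of one sstable found on disk at startup. */
    static final class Candidate {
        final int generation;
        final Set<Integer> ancestors;      // generations this sstable was compacted from
        final boolean hasCompactedMarker;  // marker written when markObsolete was called

        Candidate(int generation, Set<Integer> ancestors, boolean hasCompactedMarker) {
            this.generation = generation;
            this.ancestors = ancestors;
            this.hasCompactedMarker = hasCompactedMarker;
        }
    }

    /** Returns the generations that should be deleted instead of opened. */
    static Set<Integer> obsoleteGenerations(List<Candidate> onDisk) {
        Set<Integer> present = new HashSet<>();
        Set<Integer> obsolete = new TreeSet<>();
        for (Candidate c : onDisk) {
            present.add(c.generation);
            // Rule 1: whatever another sstable lists as an ancestor is obsolete.
            obsolete.addAll(c.ancestors);
            // Rule 2: the marker survives even when compaction produced no output.
            if (c.hasCompactedMarker) {
                obsolete.add(c.generation);
            }
        }
        // Ancestors that are already gone from disk need no action.
        obsolete.retainAll(present);
        return obsolete;
    }
}
```

Rule 2 covers the case rule 1 cannot see: when sstables compact to none, no 
surviving sstable lists them as ancestors, so only the marker identifies them.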



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
