[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.

Pavel Yaskevich (JIRA) Sun, 05 Oct 2014 02:53:51 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159494#comment-14159494
 ]


Pavel Yaskevich commented on CASSANDRA-7872:
--------------------------------------------

bq. Having Plan B is better for reliability, than not having it at all. This 
whole ticket is about Plan B. C* is recommended to run on Oracle's Java, it 
would not likely run on anything else (just b/c it uses undocumented classes, 
like sun.misc.Cleaner for example). Oracle's Java has contract for PhRefs and 
the way how it is implemented.

Having a Plan B that could potentially never trigger, and that gives people a 
false perspective on things, is worse than not having one at all. Also, 
"recommended" doesn't mean "required" (and OpenJDK has sun.misc.Cleaner for 
that matter, although its agent support is weaker compared to Oracle JDK). You 
claim that there is a "contract", but the reality is that you have failed to 
provide any proof that such a contract exists, even in Oracle JDK. Nothing 
explicitly states how or when a phantom reference, or any other type of 
reference, is supposed to be cleaned up, except "If the garbage collector 
determines at a certain point in time", which is a broad definition and could 
mean anything, e.g. only at Full GC time. As that behavior is left as a JVM 
implementation detail, there is no guarantee that it won't change across 
releases, or even that all garbage collector implementations will yield the 
same behavior, so things can go south and crash well before Plan B would 
trigger.
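
To make the timing concern concrete, here is a minimal sketch in plain Java. It is not taken from the patch: the class, the Resource type, and both method names are hypothetical. The only thing that can be relied on is that a phantom reference is not enqueued while its referent is still strongly reachable. Once the referent is released, whether and when the reference is enqueued is up to the collector, so the second method returns whatever the running JVM happens to do.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomQueueSketch {
    /** Hypothetical stand-in for an obsolete sstable that needs cleanup. */
    static final class Resource {
        final String name;
        Resource(String name) { this.name = name; }
    }

    /**
     * While the referent is strongly reachable, its phantom reference is not
     * enqueued, no matter how often GC runs. Returns false.
     */
    static boolean enqueuedWhileStronglyHeld() {
        ReferenceQueue<Resource> queue = new ReferenceQueue<>();
        Resource resource = new Resource("obsolete-sstable");
        PhantomReference<Resource> ref = new PhantomReference<>(resource, queue);
        System.gc(); // only a hint, the JVM is free to ignore it
        boolean enqueued = queue.poll() != null;
        // PhantomReference.get() always returns null by design.
        if (ref.get() != null)
            throw new IllegalStateException("phantom referent must not be retrievable");
        // Keep the referent strongly reachable up to this point (Java 9+).
        Reference.reachabilityFence(resource);
        return enqueued;
    }

    /**
     * Drops the only strong reference, hints GC, and waits up to timeoutMillis
     * (must be > 0, since 0 means "wait forever") for the phantom reference.
     * The result depends on the JVM and collector; no outcome is guaranteed.
     */
    static boolean enqueuedAfterRelease(long timeoutMillis) throws InterruptedException {
        ReferenceQueue<Resource> queue = new ReferenceQueue<>();
        PhantomReference<Resource> ref =
                new PhantomReference<>(new Resource("obsolete-sstable"), queue);
        System.gc();
        Reference<? extends Resource> polled = queue.remove(timeoutMillis);
        return polled == ref;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("enqueued while held:   " + enqueuedWhileStronglyHeld());
        System.out.println("enqueued after release: " + enqueuedAfterRelease(500));
    }
}
```

The second line of output may differ between JVMs, collectors, and heap settings, which is the point being argued here.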

I've made my opinion known and am pretty much done arguing about this, since 
there are no real arguments to support #3, so I will leave it to [~jbellis] to 
break the tie and we'll go from there.

> ensure compacted obsolete sstables are not open on node restart and nodetool 
> refresh, even on sstable reference miscounting or deletion tasks are failed.
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7872
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Oleg Anastasyev
>            Assignee: Oleg Anastasyev
>             Fix For: 2.0.11
>
>         Attachments: EnsureNoObsoleteSSTables-7872-v2.0.txt
>
>
> Since CASSANDRA-4436, compacted sstables are no longer marked with a 
> COMPACTED_MARKER file. Instead, after they are compacted, DataTracker calls 
> SSTableReader.markObsolete(), but the actual deletion happens later, in 
> SSTableReader.releaseReference().
> This reference counting is very fragile: it is very easy to introduce a 
> rare, hard-to-catch bug that keeps the reference count from ever reaching 0 
> (CASSANDRA-6503, for example).
> This means that, very rarely, obsolete sstable files are not removed from disk 
> (although Cassandra no longer uses them to read data).
> If more than gc grace time has passed since the sstable file should have been 
> removed and an operator either issues nodetool refresh or just restarts the 
> node, these obsolete files are discovered and opened for reads by the node. 
> Deleted data is thus resurrected and quickly spread by RR to the whole cluster.
> Because the consequences are very serious (even a single obsolete sstable 
> file that was not removed could render your data useless), this patch makes 
> sure no obsolete sstable file can be opened for reads by:
> 1. Removing sstables on CFS init by analyzing sstable generations (an sstable 
> is removed if there is another sstable listing it as an ancestor).
> 2. Reimplementing the COMPACTED_MARKER file for sstables. This marker is 
> created as soon as markObsolete is called. This is necessary b/c generation 
> info can be lost (when sstables compact to none).
> 3. To remove sstables sooner than restart, reimplementing the good old GC 
> phantom reference queue as well.
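
Steps 1 and 2 of the description can be sketched as one startup check. This is only an illustration under stated assumptions, not the code from the attached patch: the class and method names are hypothetical, sstables are reduced to integer generations, and the marker files are modeled as a plain set of generations.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ObsoleteSSTableSketch {
    /**
     * Hypothetical startup check.
     *
     * @param ancestorsByGeneration generation of each sstable found on disk,
     *                              mapped to the ancestor generations it lists
     * @param markedCompacted       generations that have a compacted marker
     * @return generations on disk that must not be opened for reads
     */
    static Set<Integer> obsoleteGenerations(Map<Integer, Set<Integer>> ancestorsByGeneration,
                                            Set<Integer> markedCompacted) {
        // Step 2: anything with a marker is obsolete, even if no surviving
        // sstable lists it as an ancestor (e.g. it compacted to nothing).
        Set<Integer> obsolete = new TreeSet<>(markedCompacted);
        // Step 1: an sstable is obsolete if another sstable on disk lists it
        // as an ancestor.
        for (Set<Integer> ancestors : ancestorsByGeneration.values()) {
            for (Integer ancestor : ancestors) {
                if (ancestorsByGeneration.containsKey(ancestor)) {
                    obsolete.add(ancestor);
                }
            }
        }
        // Only report generations that are actually present on disk.
        obsolete.retainAll(ancestorsByGeneration.keySet());
        return obsolete;
    }

    public static void main(String[] args) {
        Map<Integer, Set<Integer>> onDisk = new HashMap<>();
        onDisk.put(1, Collections.<Integer>emptySet());
        onDisk.put(2, Collections.<Integer>emptySet());
        onDisk.put(3, new HashSet<>(Arrays.asList(1, 2))); // compacted from 1 and 2
        onDisk.put(5, Collections.<Integer>emptySet());    // compacted to nothing
        Set<Integer> markers = new HashSet<>(Arrays.asList(5));
        System.out.println(obsoleteGenerations(onDisk, markers)); // prints [1, 2, 5]
    }
}
```

In this example, generations 1 and 2 are caught through the ancestor list of generation 3, while generation 5 is caught only by its marker, which is the case step 2 is meant to cover.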



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
