[Cassandra Wiki] Trivial Update of "DistributedDeletes" by EricTamme

Apache Wiki Mon, 12 Sep 2011 10:40:00 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.


The "DistributedDeletes" page has been changed by EricTamme:
http://wiki.apache.org/cassandra/DistributedDeletes?action=diff&rev1=6&rev2=7

Comment:
updating info regarding GC tombstones to reflect versions 0.6.8+ where minor 
compactions GC tombstones as well

  
  There's one more piece to the problem: how do we know when it's safe to 
remove tombstones? In a fully distributed system, we can't. We could add a 
coordinator like !ZooKeeper, but that would pollute the simplicity of the 
design, as well as complicating ops -- then you'd essentially have two systems 
to monitor, instead of one. (This is not to say ZK is bad software -- I believe 
it is best in class at what it does -- only that it solves a problem that we do 
not wish to add to our system.)
  
- So, Cassandra does what distributed systems designers frequently do when 
confronted with a problem we don't know how to solve: define some additional 
constraints that turn it into one that we do. Here, we defined a constant, 
GCGraceSeconds, and had each node track tombstone age locally. Once it has aged 
past the constant, it can be GC'd during compaction (see [[MemtableSSTable]]). 
This means that if you have a node down for longer than GCGraceSeconds, you 
should treat it as a failed node and replace it as described in [[Operations]]. 
The default setting is very conservative, at 10 days; you can reduce that once 
you have Anti Entropy configured to your satisfaction. And of course if you are 
only running a single Cassandra node, you can reduce it to zero, and tombstones 
will be GC'd at the first major compaction.
+ So, Cassandra does what distributed systems designers frequently do when 
confronted with a problem we don't know how to solve: define some additional 
constraints that turn it into one that we do. Here, we defined a constant, 
GCGraceSeconds, and had each node track tombstone age locally. Once it has aged 
past the constant, it can be GC'd during compaction (see [[MemtableSSTable]]). 
This means that if you have a node down for longer than GCGraceSeconds, you 
should treat it as a failed node and replace it as described in [[Operations]]. 
The default setting is very conservative, at 10 days; you can reduce that once 
you have Anti Entropy configured to your satisfaction. And of course if you are 
only running a single Cassandra node, you can reduce it to zero, and tombstones 
will be GC'd at the first major compaction.  Since 0.6.8, minor compactions 
also GC tombstones.
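
The GCGraceSeconds rule described above can be sketched roughly as follows. This is a minimal illustration, not Cassandra's actual code: the function names, the `(name, deleted_at)` cell representation, and the choice to treat an age exactly equal to the grace period as collectable are all assumptions made for the example.

```python
# Illustrative sketch only; not Cassandra's actual implementation.
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 60 * 60  # the 10-day default mentioned above


def tombstone_is_collectable(deleted_at, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return True once a tombstone's locally tracked age has reached GCGraceSeconds.

    deleted_at and now are seconds on the node's local clock (hypothetical names).
    Treating an age exactly equal to the grace period as collectable is an assumption.
    """
    return (now - deleted_at) >= gc_grace_seconds


def node_should_be_replaced(seconds_down, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """A node down for longer than GCGraceSeconds should be treated as failed."""
    return seconds_down > gc_grace_seconds


def compact(cells, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Drop collectable tombstones; keep live cells and tombstones still in grace.

    Each cell is a hypothetical (name, deleted_at) pair; deleted_at is None for live data.
    """
    return [
        (name, deleted_at)
        for name, deleted_at in cells
        if deleted_at is None
        or not tombstone_is_collectable(deleted_at, now, gc_grace_seconds)
    ]
```

With a grace period of zero, as suggested for a single-node setup, `tombstone_is_collectable` is true for every tombstone, so the first compaction that sees it drops it.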
