[Cassandra Wiki] Update of "MemtableSSTable" by Paul Prescod

Apache Wiki Tue, 13 Apr 2010 15:50:17 -0700

The "MemtableSSTable" page has been changed by Paul Prescod.
http://wiki.apache.org/cassandra/MemtableSSTable?action=diff&rev1=5&rev2=6

--------------------------------------------------

  
  Once flushed, SSTable files are immutable; no further writes may be done.  
So, on the read path, the server must (potentially, although it uses tricks 
like bloom filters to avoid doing so unnecessarily) combine row fragments from 
all the SSTables on disk, as well as any unflushed Memtables, to produce the 
requested data.
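
A minimal sketch of the read path described above, not Cassandra's actual code. The `SSTable` class, the `read_row` function, and the (value, timestamp) column layout are hypothetical names chosen for illustration; a plain Python set stands in for a bloom filter, and "newest timestamp wins" is the assumed rule for combining fragments.

```python
class SSTable:
    """Immutable table of rows; hypothetical stand-in for an on-disk SSTable."""

    def __init__(self, rows):
        # rows: {row_key: {column: (value, timestamp)}}
        self._rows = dict(sorted(rows.items()))
        # Stand-in for a bloom filter. A real bloom filter may report
        # false positives but never false negatives; a set reports neither.
        self._maybe_keys = set(self._rows)

    def might_contain(self, key):
        return key in self._maybe_keys

    def get(self, key):
        return self._rows.get(key, {})


def read_row(key, memtables, sstables):
    """Combine the fragments of one row from memtables and SSTables.

    Assumption: for each column, the fragment with the highest
    timestamp wins.
    """
    fragments = [m.get(key, {}) for m in memtables]
    # Skip any SSTable whose filter says the key is definitely absent.
    fragments += [s.get(key) for s in sstables if s.might_contain(key)]

    merged = {}
    for fragment in fragments:
        for column, (value, ts) in fragment.items():
            if column not in merged or ts > merged[column][1]:
                merged[column] = (value, ts)
    return merged
```

In this sketch an unflushed memtable is just a dict with the same row layout, so `read_row("k1", [memtable], [sstable_a, sstable_b])` returns one merged row.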
  
- To bound the number of SSTable files that must be consulted on reads, and to 
reclaim [[DistributedDeletes|space taken by unused data]], Cassandra performs 
compactions: merging multiple old SSTable files into a single new one. 
Compactions are triggered when at least 4 SStables has been flushed to disk. 
Since the input SSTables are all sorted by key, merging can be done 
efficiently, still requiring no random i/o.  Once compaction is finished, the 
old SSTable files may be deleted: note that in the worst case (a workload 
consisting of no overwrites or deletes) this will temporarily require 2x your 
existing on-disk space used.  In today's world of multi-TB disks this is 
usually not a problem but it is good to keep in mind when you are setting alert 
thresholds.
+ To bound the number of SSTable files that must be consulted on reads, and to 
reclaim [[DistributedDeletes|space taken by unused data]], Cassandra performs 
compactions: merging multiple old SSTable files into a single new one. 
Compactions are triggered when at least 4 SSTables have been flushed to disk. 
Four similar-sized SSTables are merged into a single one. They start out being 
the same size as your memtable flush size, and then form a hierarchy with each 
one doubling in size. So you'll have up to 4 of the same size as your memtable, 
then up to 4 double that size, then up to 4 double that size, etc. 
+ 
+ Since the input SSTables are all sorted by key, merging can be done 
efficiently, still requiring no random i/o.  Once compaction is finished, the 
old SSTable files may be deleted: note that in the worst case (a workload 
consisting of no overwrites or deletes) this will temporarily require 2x your 
existing on-disk space used.  In today's world of multi-TB disks this is 
usually not a problem but it is good to keep in mind when you are setting alert 
thresholds.
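
Because every input is already sorted by key, the merge step can be sketched as a streaming k-way merge. This is an illustration under stated assumptions, not Cassandra's implementation: `compact` is a hypothetical name, each SSTable is modeled as a list of (key, timestamp, value) tuples sorted by key, the newest timestamp is assumed to win for duplicate keys, and deletes are not modeled.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def compact(sstables):
    """Merge key-sorted SSTables into one key-sorted list.

    Each input is a list of (key, timestamp, value) tuples sorted by key.
    heapq.merge reads every input front to back exactly once, which
    corresponds to sequential (non-random) I/O on disk.
    """
    merged = heapq.merge(*sstables, key=itemgetter(0))
    # Entries with equal keys are adjacent in the merged stream;
    # keep only the one with the highest timestamp (assumed rule).
    return [
        max(group, key=itemgetter(1))
        for _, group in groupby(merged, key=itemgetter(0))
    ]
```

With no overlapping keys the output holds as many entries as all inputs combined, so until the old files are deleted the disk carries both copies; that is the 2x worst case mentioned above.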
  
  (The high-level memtable/sstable design as well as the "Memtable" and 
"SSTable" names come from sections 5.3 and 5.4 of 
[[http://labs.google.com/papers/bigtable.html|Google's Bigtable paper]], 
although some of the terminology around compaction differs.)
