The "MemtableSSTable" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/MemtableSSTable?action=diff&rev1=14&rev2=15

--------------------------------------------------

  == Compaction ==
  To bound the number of SSTable files that must be consulted on reads, and to reclaim [[DistributedDeletes|space taken by unused data]], Cassandra performs compactions: merging multiple old SSTable files into a single new one. Compactions are triggered when at least N SStables have been flushed to disk, where N is tunable and defaults to 4.

  Four similar-sized SSTables are merged into a single one. They start out being the same size as your memtable flush size, and then form a hierarchy with each one doubling in size. So you'll have up to N of the same size as your memtable, then up to N double that size, then up to N double that size, etc.

- "Minor" only compactions merge sstables of similar size; "major" compactions merge all sstables in a given !ColumnFamily. Only major compactions can clean out obsolete [[DistributedDeletes|tombstones]].
+ "Minor" only compactions merge sstables of similar size; "major" compactions merge all sstables in a given !ColumnFamily. Prior to Cassandra 0.6.6/0.7.0, only major compactions can clean out obsolete [[DistributedDeletes|tombstones]].

  Since the input SSTables are all sorted by key, merging can be done efficiently, still requiring no random i/o. Once compaction is finished, the old SSTable files may be deleted: note that in the worst case (a workload consisting of no overwrites or deletes) this will temporarily require 2x your existing on-disk space used. In today's world of multi-TB disks this is usually not a problem but it is good to keep in mind when you are setting alert thresholds.
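
The merge described above can be sketched as a streaming k-way merge over key-sorted inputs. This is an illustration only, not Cassandra's actual code: the function name `compact`, the `(key, timestamp, value)` tuple layout, the use of `None` as a tombstone marker, and the `drop_tombstones` flag are all assumptions made for the example. The flag also ignores the grace period a real node waits before discarding a tombstone.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def compact(sstables, drop_tombstones=False):
    """Merge key-sorted SSTables into one key-sorted list.

    Hypothetical model: each SSTable is a list of (key, timestamp, value)
    tuples sorted by key, and value None stands for a tombstone.
    """
    # heapq.merge reads every input front to back exactly once,
    # which mirrors the "no random i/o" property of the merge.
    merged = heapq.merge(*sstables, key=itemgetter(0))

    out = []
    for _key, versions in groupby(merged, key=itemgetter(0)):
        # Keep only the most recently written version of each key.
        newest = max(versions, key=itemgetter(1))
        if drop_tombstones and newest[2] is None:
            continue  # simplified: a real node honours a grace period first
        out.append(newest)
    return out


old = [("k1", 1, "a"), ("k3", 1, "c")]
newer = [("k1", 2, "A"), ("k2", 1, "b")]
deletes = [("k3", 5, None)]

print(compact([old, newer, deletes]))
print(compact([old, newer, deletes], drop_tombstones=True))
```

When no keys overlap and nothing is deleted, the output holds as many entries as all inputs combined. Since the inputs can only be removed after the output is complete, that is the case in which compaction temporarily needs about twice the space.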
