Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "LargeDataSetConsiderations" page has been changed by PeterSchuller.
The comment on this change is: Reflect that CASSANDRA-2191 may be addressing 
compaction concurrency for 0.8.
http://wiki.apache.org/cassandra/LargeDataSetConsiderations?action=diff&rev1=17&rev2=18

--------------------------------------------------

   * Prior to 0.7.1 (fixed in
[[https://issues.apache.org/jira/browse/CASSANDRA-1555|CASSANDRA-1555]]), if
you had column families with more than 143 million row keys in them, bloom
filter false positive rates were likely to go up because of implementation
limitations that capped the maximum size of a bloom filter. See
[[ArchitectureInternals]] for information on how bloom filters are used. The
negative effect of hitting this limit is that reads start taking additional
seeks to disk as the row count increases. Note that the effect you see at any
given moment depends on when compaction was last run, because the bloom
filter limit is per-sstable; it bites hardest after a major compaction, since
the entire column family is then in a single sstable. (See the false
positive sketch after this list.)
   * Compaction is currently not concurrent, so only a single compaction runs
at a time. This means that sstable counts may spike during larger
compactions: smaller sstables continue to be flushed, but cannot be compacted
away while the large compaction is in progress. The extra sstables can cause
additional seeks on reads. (See the back-of-envelope sketch after this
list.)
    * Potential future improvements: 
[[https://issues.apache.org/jira/browse/CASSANDRA-1876|CASSANDRA-1876]] and 
[[https://issues.apache.org/jira/browse/CASSANDRA-1881|CASSANDRA-1881]]
+   * Potentially already fixed for 0.8 (todo: go through the ticket history
and confirm exactly what it implies):
[[https://issues.apache.org/jira/browse/CASSANDRA-2191|CASSANDRA-2191]]
   * Consider the choice of file system. Removal of large files is
notoriously slow and seek-bound on e.g. ext2/ext3; consider xfs or ext4
instead. This affects the background unlink():ing of obsolete sstables that
happens every now and then, and it also affects start-up time: if there are
sstables pending removal when a node starts up, they are removed as part of
the start-up process, so it can be detrimental if removing, say, a terabyte
of sstables takes an hour (numbers are ballparks, not accurately measured,
and depend on circumstances).
   * Adding nodes is a slow process if each node is responsible for a large 
amount of data. Plan for this; do not try to throw additional hardware at a 
cluster at the last minute.
   * Cassandra reads through sstable index files on start-up, doing what is
known as "index sampling". This is used to keep a subset (currently and by
default, 1 out of 100) of keys and their on-disk locations in the index in
memory. See [[ArchitectureInternals]]. This means that the larger the index
files are, the longer the sampling takes. Thus, for very large indexes
(typically when you have a very large number of keys), index sampling on
start-up may add significantly to start-up time. (See the index sampling
sketch after this list.)
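
A rough, illustrative sketch of the false positive arithmetic behind the
bloom filter bullet above (not from the original page): it assumes the
standard bloom filter approximation p = (1 - e^(-kn/m))^k and a hypothetical
hard cap of 2^31 bits per filter at roughly 15 bits per key, which is
consistent with the ~143 million key figure; the actual pre-CASSANDRA-1555
internals may differ in detail.

{{{#!python
import math

M_CAP = 2 ** 31        # assumed cap: a single bloom filter limited to 2^31 bits
BITS_PER_KEY = 15      # assumed target; 2^31 / 15 is ~143 million keys

def false_positive_rate(n_keys, m_bits, k_hashes):
    # Standard bloom filter approximation: p = (1 - e^(-k*n/m))^k
    return (1.0 - math.exp(-float(k_hashes) * n_keys / m_bits)) ** k_hashes

for n in (100e6, 143e6, 300e6, 1000e6):
    m = min(n * BITS_PER_KEY, M_CAP)             # the cap binds past ~143M keys
    k = max(1, int(round(m / n * math.log(2))))  # optimal k = (m/n) * ln 2
    print("%5dM keys: false positive rate %.4f"
          % (n / 1e6, false_positive_rate(n, m, k)))
}}}

Past the cap, the per-key bit budget shrinks, and the false positive rate
climbs from well under 1% to tens of percent.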
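
A back-of-envelope sketch of the sstable spike described in the compaction
bullet above (not from the original page; all numbers are illustrative
assumptions, not measurements):

{{{#!python
# Assumed workload: one memtable flush every 5 minutes, and a large
# compaction that occupies the single compaction slot for 8 hours.
flush_interval_min = 5
large_compaction_hours = 8

# Each flush adds one sstable that cannot be compacted away until the
# large compaction finishes; reads may have to check all of them.
accumulated = large_compaction_hours * 60 // flush_interval_min
print("sstables accumulated during the compaction: %d" % accumulated)  # 96
}}}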
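
A sketch of the shape of index sampling as described in the last bullet
above (not from the original page; the function and names are illustrative,
not the actual Cassandra code):

{{{#!python
INDEX_INTERVAL = 100  # default sampling rate: keep 1 out of 100 keys

def sample_index(index_entries):
    """index_entries: an iterable of (row_key, on_disk_offset) pairs in
    sorted key order, as read sequentially from an sstable index file."""
    sample = []
    for i, entry in enumerate(index_entries):
        if i % INDEX_INTERVAL == 0:   # retain every 100th entry in memory
            sample.append(entry)
    return sample
}}}

The retained sample is only ~1% of the entries, but the scan still reads
every entry, which is why start-up time grows with total index size.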
