Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.
The "LargeDataSetConsiderations" page has been changed by PeterSchuller.
http://wiki.apache.org/cassandra/LargeDataSetConsiderations?action=diff&rev1=2&rev2=3

--------------------------------------------------

  Unless otherwise noted, the points refer to Cassandra 0.7 and above.

+  * Disk space usage in Cassandra can vary fairly suddenly over time. If you have significant amounts of data such that available disk space is not significantly higher than usage, consider:
+    * Compaction of a column family can as much as double the disk space used by that column family (in the case of a major compaction and no deletions).
+    * Repair operations can increase disk space demands (particularly in 0.6, less so in 0.7; TODO: provide actual maximum growth and what it depends on).
   * As your data set becomes larger and larger (assuming significantly larger than memory), you become more and more dependent on caching to elide I/O operations. As you plan and test your capacity, keep in mind that:
     * The Cassandra row cache is in the JVM heap and is unaffected (remains warm) by compactions and repair operations.
     * The key cache is affected by compaction and repair.
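
As a rough illustration of the disk space point above, the worst case for a major compaction (no deletions) can be sketched as a simple headroom check. This is only a back-of-the-envelope sketch, not part of Cassandra: the function name `compaction_headroom`, the column family names, and the sizes are all hypothetical, and the extra space that repair may need is not modelled because the page leaves it as a TODO.

```python
def compaction_headroom(cf_sizes, free_space):
    """Check, per column family, whether a worst-case major compaction fits.

    cf_sizes   -- dict mapping a column family name to its current on-disk
                  size (hypothetical input; any unit, e.g. GB)
    free_space -- currently free disk space, in the same unit

    Assumption taken from the text above: a major compaction with no
    deletions can temporarily need as much extra space as the column
    family already occupies (i.e. usage can double).
    """
    report = {}
    for name, size in cf_sizes.items():
        extra_needed = size  # worst case: a full second copy of the data
        report[name] = {
            "extra_needed": extra_needed,
            "fits": extra_needed <= free_space,
        }
    return report


# Hypothetical example: two column families on a disk with 100 GB free.
example = compaction_headroom({"users": 40, "events": 120}, free_space=100)
```

With these made-up numbers, "users" fits (40 GB extra needed, 100 GB free) while "events" does not (120 GB extra needed), so a major compaction of "events" could fill the disk.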
