The "FAQ" page has been changed by JonathanEllis.
The comment on this change is: move hardware to separate page.
http://wiki.apache.org/cassandra/FAQ?action=diff&rev1=28&rev2=29

--------------------------------------------------

  <<Anchor(what_kind_of_hardware_should_i_use)>>
  == What kind of hardware should I run Cassandra on? ==
  
+ See [CassandraHardware].
- === Memory ===
- The most recently written data resides in memory tables (aka 
[[MemtableThresholds|memtables]]), but older data that has been flushed to disk 
can be kept in the OS's file-system cache. In other words, ''the more memory, 
the better'', with 1GB being the minimum recommended.
- 
- === CPU ===
- Many workloads will actually be CPU-bound in Cassandra before being 
memory-bound.  Cassandra is highly concurrent and will make good use of however 
many cores you can give it.
- 
- 
- === Disk ===
- The short answer here is, ''at least 2 disks'', one to keep your 
`CommitLogDirectory` on, the other to use in `DataFileDirectories`. The exact 
answer though depends a lot on your usage so it's important to understand what 
is going on here.
- 
- Cassandra persists data to disk for two very different purposes. The first is 
when a new write is made, so that it can be replayed after a crash or system 
shutdown. The second is when thresholds are exceeded and memtables are flushed 
to disk as SSTables.
- 
- Commit logs receive every write made to a Cassandra node and have the 
potential to block client operations, but they are only ever read on node 
start-up. SSTable writes, on the other hand, occur asynchronously, but SSTables 
are read to satisfy client look-ups. They are also periodically merged and rewritten 
in a process called ''compaction''. Another important distinction is that 
commit logs are purged after the corresponding data has been flushed to disk as 
an SSTable, so `CommitLogDirectory` only holds not-yet-flushed data while the 
directories in `DataFileDirectories` store all of the data written to a node.
- 
- So to summarize, use a different device for your `CommitLogDirectory`; it 
needn't be large, but it should be fast enough to receive all of your writes. 
Then, use one or more devices for `DataFileDirectories` and make sure they are 
both large enough to house all of your data, and fast enough to satisfy your 
reads and to keep up with flushing and compaction.
  
  <<Anchor(architecture)>>
  == What are SSTables and Memtables? ==
