[Cassandra Wiki] Update of "FAQ" by JonathanEllis

Apache Wiki Mon, 08 Feb 2010 11:35:27 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.


The "FAQ" page has been changed by JonathanEllis.
The comment on this change is: link MemtableSSTable.
http://wiki.apache.org/cassandra/FAQ?action=diff&rev1=36&rev2=37

--------------------------------------------------

  
  <<Anchor(reads_slower_writes)>>
  == Why are reads slower than writes? ==
- Unlike all major relational databases and some NoSQL systems, Cassandra does 
not use b-trees and in-place updates on disk.  Instead, it uses a 
sstable/memtable model like Bigtable's: writes to each ColumnFamily are grouped 
together in an in-memory structure before being flushed (sorted and written to 
disk).  Thus, writes are extremely fast, costing only a commitlog append and an 
amortized sequential write for the flush.  This means that writes cost no 
random I/O, compared to a b-tree system which not only has to seek to the data 
location to overwrite, but also may have to seek to read different levels of 
the index if it outgrows disk cache!  
+ Unlike all major relational databases and some NoSQL systems, Cassandra does 
not use b-trees and in-place updates on disk.  Instead, it uses a 
sstable/memtable model like Bigtable's: writes to each ColumnFamily are grouped 
together in an in-memory structure before being flushed (sorted and written to 
disk).  This means that writes cost no random I/O, compared to a b-tree system 
which not only has to seek to the data location to overwrite, but also may have 
to seek to read different levels of the index if it outgrows disk cache!  
  
- The downside is that on a read, Cassandra has to (potentially) merge row 
fragments from multiple sstables on disk.  We think this is a tradeoff worth 
making, first because scaling writes has always been harder than scaling reads, 
and second because as your data corpus grows Cassandra's read disadvantage 
narrows vs b-tree systems that have to do multiple seeks against a large index.
+ The downside is that on a read, Cassandra has to (potentially) merge row 
fragments from multiple sstables on disk.  We think this is a tradeoff worth 
making, first because scaling writes has always been harder than scaling reads, 
and second because as your data corpus grows Cassandra's read disadvantage 
narrows vs b-tree systems that have to do multiple seeks against a large index. 
 See MemtableSSTable for more details.
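
The memtable/sstable model described in the FAQ text above can be sketched as a toy in Python. This is only an illustration under stated assumptions, not Cassandra's actual code: the class ToyStore, its method names, and the flush_threshold parameter are all hypothetical, and plain Python lists stand in for the on-disk commitlog and sstables. It shows why a write touches only an append and an in-memory structure, while a read may have to merge row fragments from several sstables.

```python
class ToyStore:
    """Toy memtable/sstable store (hypothetical; not Cassandra's implementation)."""

    def __init__(self, flush_threshold=2):
        self.commitlog = []   # stands in for the append-only commitlog
        self.memtable = {}    # row key -> {column: value}, held in memory
        self.sstables = []    # immutable sorted runs, oldest first
        self.flush_threshold = flush_threshold  # rows per memtable (assumed knob)

    def write(self, key, column, value):
        # A write is a sequential log append plus an in-memory update:
        # no seek to an existing on-disk location is needed.
        self.commitlog.append((key, column, value))
        self.memtable.setdefault(key, {})[column] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Sort the rows by key and emit them as one immutable run.
        run = sorted(self.memtable.items(), key=lambda item: item[0])
        self.sstables.append(run)
        self.memtable = {}

    def read(self, key):
        # A read may have to merge fragments of the same row from every
        # sstable, oldest to newest, and then from the memtable.
        row = {}
        for run in self.sstables:
            for row_key, columns in run:
                if row_key == key:
                    row.update(columns)
        row.update(self.memtable.get(key, {}))
        return row


store = ToyStore(flush_threshold=2)
store.write("alice", "email", "a@example.com")
store.write("bob", "email", "b@example.com")    # second row triggers a flush
store.write("alice", "city", "Austin")
store.write("carol", "city", "Paris")           # triggers a second flush
merged = store.read("alice")                    # fragments come from two sstables
```

In this sketch the row for "alice" ends up split across two flushed runs, so the read has to visit both, which is the read-side cost the FAQ entry describes.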
