SSTables must be sorted by token, or we can't compact them efficiently. Since writes usually do not arrive in token order, we stage them first in a memtable.
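
To make that concrete, here is a rough Python sketch of the idea. It is not Cassandra's actual code: the `token`, `Memtable`, and `compact` names are hypothetical, MD5 merely stands in for a partitioner, and plain lists stand in for the on-disk commit log and SSTable files. Writes are appended to the log in arrival order and staged in memory; the flush emits them sorted by token, which is what lets two flushed runs be merged in a single sequential pass.

```python
import hashlib
import heapq


def token(key: str) -> int:
    # Hypothetical stand-in for a partitioner: hash the key to an integer token.
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big")


class Memtable:
    def __init__(self):
        self.commit_log = []  # stand-in for an append-only file on disk
        self.rows = {}        # token -> (key, value), held in memory

    def write(self, key, value):
        # Durability: append in arrival order, no sorting or seeking needed.
        self.commit_log.append((key, value))
        # Staging: keep the latest value per token in memory.
        self.rows[token(key)] = (key, value)

    def flush(self):
        # Emit rows sorted by token; this list plays the role of an SSTable.
        sstable = [(t, k, v) for t, (k, v) in sorted(self.rows.items())]
        # Once the data is in the SSTable, the log entries are no longer needed.
        self.rows = {}
        self.commit_log = []
        return sstable


def compact(older, newer):
    # Merge two token-sorted runs in one pass. heapq.merge is stable, so on
    # equal tokens the row from `newer` arrives second and replaces the older one.
    merged = []
    for row in heapq.merge(older, newer, key=lambda r: r[0]):
        if merged and merged[-1][0] == row[0]:
            merged[-1] = row
        else:
            merged.append(row)
    return merged
```

If the rows were written straight to disk in arrival order instead, `compact` could not be a single merge pass; each run would have to be sorted first.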

(cc user@)

On Thu, May 23, 2013 at 8:44 AM, Ansar Rafique <ansa...@hotmail.com> wrote:
> Hi Jonathan,
>
> I am Ansar Rafique and I asked you a few questions 2 weeks ago about
> Cassandra implementation. I was watching your presentation where you
> suggested the page below.
>
> http://nosql.mypopescu.com/post/27684111441/cassandra-and-solid-state-drives
>
> I have a question and I have tried to find the answer but haven't really
> gotten a satisfactory response yet. My question is why Cassandra uses a
> commit log for durability instead of writing directly to an SSTable.
> Cassandra achieves high write throughput because it stores data first in a
> memtable and then flushes it to disk. Sounds good, but remember Cassandra
> also writes to the commit log for durability. I checked, and it is
> documented that the writes to the memtable and the commit log are
> synchronous, which means it writes to the commit log first, waits until
> that completes, and then starts writing to the memtable, or vice versa.
> Writing a transaction to the commit log requires an I/O operation, which
> means for each insert we need an I/O :( for writing data to the commit log,
> and later more I/Os to flush the data to disk again. Isn't writing to the
> commit log overhead? Isn't it better to write data directly to disk instead
> of to the commit log?
>
> Remember I/O operations are expensive, and a reduction in I/Os means an
> improvement in performance. If we look at an RDBMS, it stores data in a
> commit log as well as on disk. Fair enough, but if we don't insert data
> into the commit log, its performance should be the same as Cassandra's,
> because it performs I/O to insert data on disk and Cassandra also performs
> I/O to insert data into the commit log. Is the commit log less expensive?
> I didn't really understand the magic :) Would you like to elaborate on it
> more?
>
> Thank you in advance for your time. Looking forward to hearing from you.
>
> Regards,
> Ansar Rafique

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced