How long does it take for updates to get merged / compacted into the main data file?

On Mon, Nov 7, 2016 at 5:31 AM, Ben Bromhead <b...@instaclustr.com> wrote:

> To add some flavor as to how the commitlog implementation is so quick.
>
> It only flushes to disk every 10s by default. So writes are effectively
> done to memory and then to disk asynchronously later on. This is generally
> accepted to be OK, as the write is also going to other nodes.
>
> You can of course change this behavior to flush on each write or to skip
> the commitlog altogether (danger!). This however will change how "safe"
> things are from a durability perspective.
>
> On Sun, Nov 6, 2016, 12:51 Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>
>> Cassandra writes are particularly fast, for a few reasons:
>>
>> 1) Most writes go to a commitlog (append-only file, written
>> linearly, so particularly fast in terms of disk operations) and then pushed
>> to the memTable. Memtable is flushed in batches to the permanent data
>> files, so it buffers many mutations and then does a sequential write to
>> persist that data to disk.
>>
>> 2) Reads may have to merge data from many data tables on disk.
>> Because the writes (described very briefly in step 1) write to immutable
>> files, updates/deletes have to be merged on read – this is extra effort for
>> the read path.
>>
>> If you don’t do much in terms of overwrites/deletes, and your partitions
>> are particularly small, and your data fits in RAM (probably mmap/page cache
>> of data files, unless you’re using the row cache), reads may be very fast
>> for you. Certainly individual reads on low-merge workloads can be < 0.1ms.
>>
>> - Jeff
>>
>> *From: *Vikas Jaiman <er.vikasjai...@gmail.com>
>> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>> *Date: *Sunday, November 6, 2016 at 12:42 PM
>> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>> *Subject: *Are Cassandra writes are faster than reads?
>>
>> Hi all,
>>
>> Are Cassandra writes are faster than reads ?? If yes, why is this so? I
>> am using consistency 1 and data is in memory.
>>
>> Vikas
>>
> --
> Ben Bromhead
> CTO | Instaclustr <https://www.instaclustr.com/>
> +1 650 284 9692
> Managed Cassandra / Spark on AWS, Azure and Softlayer
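
For anyone following the thread, here is a rough sketch of the write and read paths described above: append to a commitlog, update a memtable, flush the memtable to an immutable file, merge on read, and compact. It is a toy model and not Cassandra's code. `ToyNode`, `memtable_limit`, and the integer clock are all made up for illustration, deletes/tombstones are left out, and clearing the commitlog on flush is a simplification.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[str, int]  # (value, write timestamp)


class ToyNode:
    """Toy log-structured store; an illustration only, not Cassandra's implementation."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.commitlog: List[Tuple[str, str, int]] = []  # append-only log
        self.memtable: Dict[str, Cell] = {}              # in-memory buffer
        self.sstables: List[Dict[str, Cell]] = []        # never modified after flush
        self.memtable_limit = memtable_limit             # hypothetical flush threshold
        self.clock = 0                                   # stand-in for write timestamps

    def write(self, key: str, value: str) -> None:
        self.clock += 1
        # Sequential append first, then the in-memory update.
        self.commitlog.append((key, value, self.clock))
        self.memtable[key] = (value, self.clock)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Persist the memtable as one new immutable table, written in key order."""
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}
            self.commitlog.clear()  # simplification: flushed data no longer needs the log

    def read(self, key: str) -> Optional[str]:
        """Merge on read: collect every version of the key, newest timestamp wins."""
        candidates = [table[key] for table in self.sstables if key in table]
        if key in self.memtable:
            candidates.append(self.memtable[key])
        if not candidates:
            return None
        return max(candidates, key=lambda cell: cell[1])[0]

    def compact(self) -> None:
        """Rewrite all tables into one, keeping only the newest version of each key."""
        merged: Dict[str, Cell] = {}
        for table in self.sstables:
            for key, (value, ts) in table.items():
                if key not in merged or ts > merged[key][1]:
                    merged[key] = (value, ts)
        self.sstables = [dict(sorted(merged.items()))] if merged else []


node = ToyNode(memtable_limit=2)
node.write("a", "1")
node.write("b", "2")  # memtable is full, so it is flushed to the first table
node.write("a", "3")  # overwrite of "a" lands in a new memtable
node.write("c", "4")  # second flush; "a" now exists in two tables
latest_before_compaction = node.read("a")
tables_before_compaction = len(node.sstables)
node.compact()
tables_after_compaction = len(node.sstables)
```

In this toy, the overwrite of "a" sits in two tables until `compact()` runs, so every read of "a" has to compare both versions. That is the extra read-path work Jeff mentions, and it lasts until compaction merges the tables.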