Awesome! For a full explanation of what you are seeing (we call it micro batching), check out Adam Zegelin's talk on it: https://www.youtube.com/watch?v=wF3Ec1rdWgc
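
For readers who want something concrete, here is a minimal sketch of the micro-batching idea: split a large set of writes into many small batches instead of a few large ones. The names `chunked`, `write_in_micro_batches` and `send_batch` are hypothetical and not part of any driver, and the default of 16 is an arbitrary value picked only because the message quoted below reports better results below 20.

```python
from itertools import islice


def chunked(items, size):
    """Yield consecutive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def write_in_micro_batches(rows, send_batch, batch_size=16):
    """Send `rows` in small batches and return how many batches were sent.

    `send_batch` is a caller-supplied callable (an assumption of this sketch)
    that performs the actual write for one list of rows.
    """
    sent = 0
    for chunk in chunked(rows, batch_size):
        send_batch(chunk)
        sent += 1
    return sent


if __name__ == "__main__":
    # Stand-in for a real write: just record the size of each batch.
    sizes = []
    write_in_micro_batches(range(45), lambda chunk: sizes.append(len(chunk)), batch_size=20)
    print(sizes)
```

With a real driver, `send_batch` would build and execute one batch per chunk. The right batch size depends on the workload, so treat the number as something to measure rather than a recommendation.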
On Tue, 8 Nov 2016 at 02:21 Rajesh Radhakrishnan <[email protected]> wrote:

> Hi,
>
> Just found that reducing the batch size below 20 also increases the writing speed and reduces memory usage (especially for the Python driver).
>
> Kind regards,
> Rajesh R
>
> ------------------------------
> *From:* Ben Bromhead [[email protected]]
> *Sent:* 07 November 2016 05:44
> *To:* [email protected]
> *Subject:* Re: Are Cassandra writes are faster than reads?
>
> They can be and it depends on your compaction strategy :)
>
> On Sun, 6 Nov 2016 at 21:24 Ali Akhtar <[email protected]> wrote:
>
> tl;dr? I just want to know if updates are bad for performance, and if so, for how long.
>
> On Mon, Nov 7, 2016 at 10:23 AM, Ben Bromhead <[email protected]> wrote:
>
> Check out https://wiki.apache.org/cassandra/WritePathForUsers for the full gory details.
>
> On Sun, 6 Nov 2016 at 21:09 Ali Akhtar <[email protected]> wrote:
>
> How long does it take for updates to get merged / compacted into the main data file?
>
> On Mon, Nov 7, 2016 at 5:31 AM, Ben Bromhead <[email protected]> wrote:
>
> To add some flavor as to how the commitlog implementation is so quick.
>
> It only flushes to disk every 10s by default. So writes are effectively done to memory and then to disk asynchronously later on.
> This is generally accepted to be OK, as the write is also going to other nodes.
>
> You can of course change this behavior to flush on each write or to skip the commitlog altogether (danger!). This however will change how "safe" things are from a durability perspective.
>
> On Sun, Nov 6, 2016, 12:51 Jeff Jirsa <[email protected]> wrote:
>
> Cassandra writes are particularly fast, for a few reasons:
>
> 1) Most writes go to a commitlog (append-only file, written linearly, so particularly fast in terms of disk operations) and are then pushed to the memtable. The memtable is flushed in batches to the permanent data files, so it buffers many mutations and then does a sequential write to persist that data to disk.
>
> 2) Reads may have to merge data from many data tables on disk. Because the writes (described very briefly in step 1) write to immutable files, updates/deletes have to be merged on read – this is extra effort for the read path.
>
> If you don’t do much in terms of overwrites/deletes, and your partitions are particularly small, and your data fits in RAM (probably mmap/page cache of data files, unless you’re using the row cache), reads may be very fast for you. Certainly individual reads on low-merge workloads can be < 0.1ms.
>
> - Jeff
>
> *From: *Vikas Jaiman <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Sunday, November 6, 2016 at 12:42 PM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Are Cassandra writes are faster than reads?
>
> Hi all,
>
> Are Cassandra writes faster than reads? If yes, why is this so? I am using consistency 1 and data is in memory.
>
> Vikas

--
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer
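
As a rough illustration of the write and read paths described in the thread above, the toy model below appends each write to a log and an in-memory table, flushes the buffered data as an immutable table, and makes reads look through several tables. `ToyTable` and `flush_threshold` are hypothetical names. This is a simplified sketch and not how Cassandra is implemented: real reads merge cells by timestamp, and compaction, tombstones and replication are left out entirely.

```python
class ToyTable:
    """Toy log-structured store: append-only log, memtable, immutable flushed tables."""

    def __init__(self, flush_threshold=3):
        self.commitlog = []   # append-only record of mutations not yet flushed
        self.memtable = {}    # latest in-memory value per key
        self.sstables = []    # immutable flushed tables, oldest first
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        # A write is one log append plus one in-memory update;
        # no previously flushed table is read or rewritten.
        self.commitlog.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist all buffered mutations at once as one sorted, immutable table.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commitlog.clear()  # flushed data no longer needs the log

    def read(self, key):
        # Check the memtable, then flushed tables from newest to oldest.
        # Returns (value, number of places consulted).
        probes = 1
        if key in self.memtable:
            return self.memtable[key], probes
        for table in reversed(self.sstables):
            probes += 1
            if key in table:
                return table[key], probes
        return None, probes


if __name__ == "__main__":
    table = ToyTable(flush_threshold=2)
    for i in range(6):
        table.write(f"k{i}", i)
    table.write("k0", "updated")  # the overwrite lives in the memtable for now
    print(table.read("k0"))       # found in the memtable
    print(table.read("k1"))       # found only after checking every flushed table
```

In this model every write costs the same small amount of work, while the cost of a read grows with the number of flushed tables it has to consult, which is the asymmetry the thread describes.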
