Sorry to hear that, Robert. I ran into a similar issue a while ago. I had an extremely heavy write and update load, and as a result Cassandra (1.2.9) was constantly flushing to disk and constantly GCing. I tried exactly the same steps you tried (tuning memtable_flush_writers to 2 and memtable_flush_queue_size to 8), with no luck. Almost all of the issues went away when I migrated to 1.2.13; that release also had some fixes which I badly needed. What version are you running? (I tried to look in the thread but couldn't find one, sorry if this is a repeat question.)
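In case it helps anyone trying the same thing, these are the knobs in cassandra.yaml I was referring to; the values are just what I tried on 1.2.9 (defaults quoted from memory), not a recommendation:

    # cassandra.yaml -- what I tried on 1.2.9, not a recommendation
    memtable_flush_writers: 2       # default is one per data directory, I believe
    memtable_flush_queue_size: 8    # default is 4, if I remember right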
Dropped messages are a sign that Cassandra is taking a heavy load; that's the load-shedding mechanism. I would love to see some sort of back-pressure implemented.

-sandeep

On Tue, Feb 4, 2014 at 6:10 PM, Robert Wille <rwi...@fold3.com> wrote:

> I ran my test again, and Flush Writer's "All time blocked" increased to 2 and then shortly thereafter GC went into its death spiral. I doubled memtable_flush_writers (to 2) and memtable_flush_queue_size (to 8) and tried again.
>
> This time, the table that always sat with Memtable data size = 0 now showed increases in Memtable data size. That was encouraging. It never flushed, which isn't too surprising, because that table has relatively few rows and they are pretty wide. However, on the fourth table to clean, Flush Writer's "All time blocked" went to 1, and then there were no more completed events, and about 10 minutes later GC went into its death spiral. I assume that each time Flush Writer completes an event, that means a table was flushed. Is that right? Also, I got two dropped mutation messages at the same time that Flush Writer's All time blocked incremented.
>
> I then increased the writers and queue size to 3 and 12, respectively, and ran my test again. This time All time blocked remained at 0, but I still suffered death by GC.
>
> I would almost think that this is caused by high load on the server, but I've never seen CPU utilization go above about two of my eight available cores. If high load triggers this problem, then that is very disconcerting. That means that a CPU spike could permanently cripple a node. Okay, not permanently, but until a manual flush occurs.
>
> If anyone has any further thoughts, I'd love to hear them. I'm quite at the end of my rope.
>
> Thanks in advance
>
> Robert
>
> From: Nate McCall <n...@thelastpickle.com>
> Reply-To: <user@cassandra.apache.org>
> Date: Saturday, February 1, 2014 at 9:25 AM
> To: Cassandra Users <user@cassandra.apache.org>
> Subject: Re: Lots of deletions results in death by GC
>
> What's the output of 'nodetool tpstats' while this is happening? Specifically, is Flush Writer "All time blocked" increasing? If so, play around with turning up memtable_flush_writers and memtable_flush_queue_size and see if that helps.
>
>
> On Sat, Feb 1, 2014 at 9:03 AM, Robert Wille <rwi...@fold3.com> wrote:
>
>> A few days ago I posted about an issue I'm having where GC takes a long time (20-30 seconds), and it happens repeatedly and basically no work gets done. I've done further investigation, and I now believe that I know the cause. If I do a lot of deletes, it creates memory pressure until the memtables are flushed, but Cassandra doesn't flush them. If I manually flush, then life is good again (although that takes a very long time because of the GC issue). If I just leave the flushing to Cassandra, then I end up with death by GC. I believe that when the memtables are full of tombstones, Cassandra doesn't realize how much memory the memtables are actually taking up, and so it doesn't proactively flush them in order to free up heap.
>>
>> As I was deleting records out of one of my tables, I was watching it via nodetool cfstats, and I found a very curious thing:
>>
>> Memtable cell count: 1285
>> Memtable data size, bytes: 0
>> Memtable switch count: 56
>>
>> As the deletion process was chugging away, the memtable cell count increased, as expected, but the data size stayed at 0. No flushing occurred.
>>
>> Here's the schema for this table:
>>
>> CREATE TABLE bdn_index_pub (
>>     tshard VARCHAR,
>>     pord INT,
>>     ord INT,
>>     hpath VARCHAR,
>>     page BIGINT,
>>     PRIMARY KEY (tshard, pord)
>> ) WITH gc_grace_seconds = 0 AND compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 160 };
>>
>> I have a few tables that I run this cleaning process on, and not all of them exhibit this behavior. One of them reported an increasing number of bytes, as expected, and it also flushed as expected. Here's the schema for that table:
>>
>> CREATE TABLE bdn_index_child (
>>     ptshard VARCHAR,
>>     ord INT,
>>     hpath VARCHAR,
>>     PRIMARY KEY (ptshard, ord)
>> ) WITH gc_grace_seconds = 0 AND compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 160 };
>>
>> In both cases, I'm deleting the entire record (i.e. specifying just the first component of the primary key in the delete statement). Most records in bdn_index_pub have 10,000 rows per record. bdn_index_child usually has just a handful of rows, but a few records can have up to 10,000.
>>
>> Still a further mystery: 1285 tombstones in the bdn_index_pub memtable doesn't seem like nearly enough to create a memory problem. Perhaps there are other flaws in the memory metering. Or perhaps there is some other issue that causes Cassandra to mismanage the heap when there are a lot of deletes. One other thought I had is that I page through these tables and clean them out as I go. Perhaps there is some interaction between the paging and the deleting that causes the GC problems, and I should create a list of keys to delete and then delete them after I've finished reading the entire table.
>>
>> I reduced memtable_total_space_in_mb from the default (probably 2.7 GB) to 1 GB, in hopes that it would force Cassandra to flush tables before I ran into death by GC, but it didn't seem to help.
>>
>> I'm using Cassandra 2.0.4.
>>
>> Any insights would be greatly appreciated. I can't be the only one that has periodic delete-heavy workloads. Hopefully someone else has run into this and can give advice.
>>
>> Thanks
>>
>> Robert
>
>
> --
> -----------------
> Nate McCall
> Austin, TX
> @zznate
>
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
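For anyone following along later, the partition-level delete Robert describes (specifying only the first component of the primary key) would look roughly like this against his bdn_index_pub table; the tshard value here is made up:

    -- hypothetical key; drops the whole partition (up to ~10,000 rows)
    DELETE FROM bdn_index_pub WHERE tshard = 'shard-0042';

Each statement like that puts a single partition-level tombstone in the memtable, which would be consistent with the cell count climbing while the metered data size stays at 0, and with Robert's suspicion that the memory metering undercounts tombstones.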