5-node cluster running Cassandra 1.0.2, handling about 1300 reads and 1300
writes/sec across 3 column families in the same keyspace.  Two client machines
each do about the same amount of reads and writes, but one sees an average
response time in the 4-40ms range and the other in the 200-800ms range.  Both
run identical software: a homebrew application using the hector-1.0-3 client.

Traffic used to peak at 6k reads and 6k writes/sec, according to reporting
from our software, and now it tops out at 1300/sec each.  The CPUs on the
Cassandra boxes are bored: no thread within Cassandra is using more than 3%
CPU.  Disk is only 10% full on the most loaded box.

From nodetool tpstats on one of the servers (columns are Active, Pending,
Completed):

MemtablePostFlusher               1       102             36

Not all servers have the same number of pending tasks.  The five nodes show
0, 1, 17, 37, and 105.
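
For anyone who wants to compare the backlog across nodes the same way, here is
a minimal sketch in Python.  It assumes you have already captured the text of
`nodetool -h <host> tpstats` for each node yourself; the function names
parse_tpstats and pending_by_node and the host label below are made up for
illustration, and the only assumption about the output format is that each
pool line starts with the pool name followed by the Active, Pending, and
Completed counts.

```python
import re

# Matches a pool line: name, then Active, Pending, Completed as integers.
# Header lines ("Pool Name  Active ...") do not match because the second
# token is not a number. Any extra trailing columns are ignored.
POOL_LINE = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)")


def parse_tpstats(text):
    """Return {pool_name: (active, pending, completed)} from tpstats-style text."""
    pools = {}
    for line in text.splitlines():
        match = POOL_LINE.match(line)
        if match:
            name, active, pending, completed = match.groups()
            pools[name] = (int(active), int(pending), int(completed))
    return pools


def pending_by_node(outputs, pool="MemtablePostFlusher"):
    """outputs maps a host label to its tpstats text.

    Returns {host: pending count} for the given pool, skipping hosts
    whose output does not contain that pool.
    """
    result = {}
    for host, text in outputs.items():
        stats = parse_tpstats(text)
        if pool in stats:
            result[host] = stats[pool][1]
    return result


# The single line quoted in this post, under a hypothetical host label.
sample = "MemtablePostFlusher               1       102             36"
print(pending_by_node({"node-a": sample}))
```

A pending count that stays high or keeps growing on some nodes while the
others sit near 0 is the pattern described above.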

It looks stuck and is not recovering; it has been like this for an hour.
I've attached the end of the cassandra.log from the server with the most
pending tasks.  There are some interesting exceptions in there.

As always, all help is appreciated!  :p


Attachment: cassandra.log
Description: Binary data
