Some random thoughts. I would like to thank you for giving us an interesting problem; Cassandra can get boring sometimes because it is too stable.

- Do you have a way to monitor the network traffic to see if it is increasing between restarts, or does it seem relatively flat?
- What activities are happening when you observe the (increasing) latencies? Something must be writing to keyspaces, and something, I presume, is reading. What is the workload?
- When using SSDs, there are some /devices optimizations for SSDs. I wonder if those were done (they will cause some IO latency, but not like this).

*Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872*

On Thu, Jun 1, 2017 at 7:18 AM, Daniel Steuernol <dan...@sendwithus.com> wrote:

> I am just restarting cassandra. I don't think I'm having any disk space
> issues, but we're having issues where operations have increased latency, and
> these are fixed by a restart. It seemed like the load reported by nodetool
> status might be helpful in understanding what is going wrong, but I'm not
> sure. Another symptom is that nodes will report as DN in nodetool status
> and then come back up again just a minute later.
>
> I'm not really sure what to track to find out what exactly is going wrong
> on the cluster, so any insight or debugging techniques would be super
> helpful.
>
> On May 31 2017, at 5:07 pm, Anthony Grasso <anthony.gra...@gmail.com>
> wrote:
>
>> Hi Daniel,
>>
>> When you say that the nodes have to be restarted, are you just restarting
>> the Cassandra service or are you restarting the machine?
>> How are you reclaiming disk space at the moment? Does disk space free up
>> after the restart?
>>
>> Regarding storage on nodes, keep in mind the more data stored on a node,
>> the longer some operations to maintain that data will take to complete. In
>> addition, the more data that is on each node, the longer it will take to
>> stream data to other nodes. Whether it is replacing a down node or
>> inserting a new node, having a large amount of data on each node will mean
>> that it takes longer for a node to join the cluster if it is streaming the
>> data.
>>
>> Kind regards,
>> Anthony
>>
>> On 30 May 2017 at 02:43, Daniel Steuernol <dan...@sendwithus.com> wrote:
>>
>> The cluster is running with RF=3, and right now each node is storing about
>> 3-4 TB of data. I'm using r4.2xlarge EC2 instances; these have 8 vCPUs, 61
>> GB of RAM, and the disks attached for the data drive are gp2 SSD EBS
>> volumes with 10k IOPS. I guess this brings up the question of what's a good
>> marker to decide on whether to increase disk space vs provisioning a new
>> node?
>>
>> On May 29 2017, at 9:35 am, tommaso barbugli <tbarbu...@gmail.com>
>> wrote:
>>
>> Hi Daniel,
>>
>> This is not normal. Possibly a capacity problem. What's the RF, how much
>> data do you store per node, and what kind of servers do you use (core count,
>> RAM, disk, ...)?
>>
>> Cheers,
>> Tommaso
>>
>> On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol <dan...@sendwithus.com>
>> wrote:
>>
>> I am running a 6 node cluster, and I have noticed that the reported load
>> on each node rises throughout the week and grows way past the actual disk
>> space used and available on each node. Eventually, latency for
>> operations also suffers and the nodes have to be restarted. A couple of
>> questions on this: is this normal? Also, does cassandra need to be restarted
>> every few days for best performance? Any insight on this behaviour would be
>> helpful.
>>
>> Cheers,
>> Daniel
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
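
On the question of what to track: one option is to record the per-node load and up/down state from nodetool status at regular intervals, so the growth between restarts and the brief DN flaps can be lined up against latency. The following is only a rough sketch. The function name, the sample output, and the assumption that load units are binary multiples are mine, and the column layout of nodetool status may differ between Cassandra versions, so check it against your own output first.

```python
import re

# Multipliers for the load units; binary multiples are an assumption here.
_UNITS = {
    "bytes": 1,
    "KB": 1024, "KiB": 1024,
    "MB": 1024**2, "MiB": 1024**2,
    "GB": 1024**3, "GiB": 1024**3,
    "TB": 1024**4, "TiB": 1024**4,
}

# A node row starts with a two-letter code: U/D (up/down) followed by
# N/L/J/M (normal/leaving/joining/moving), then the address and the load.
_ROW = re.compile(r"^([UD])([NLJM])\s+(\S+)\s+([\d.]+)\s+(\w+)\s")


def parse_nodetool_status(text):
    """Return one dict per node row found in `nodetool status` output."""
    nodes = []
    for line in text.splitlines():
        match = _ROW.match(line)
        if not match:
            continue  # header, separator, or a row with an unknown load
        up, state, address, value, unit = match.groups()
        if unit not in _UNITS:
            continue  # unit not recognised by this sketch
        nodes.append({
            "address": address,
            "up": up == "U",
            "state": state,
            "load_bytes": int(float(value) * _UNITS[unit]),
        })
    return nodes


# Hypothetical sample output, made up for illustration only.
SAMPLE = """\
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns  Host ID  Rack
UN  10.0.0.1   3.5 TB   256     ?     aaaa     1a
DN  10.0.0.2   4 TB     256     ?     bbbb     1b
"""

if __name__ == "__main__":
    for node in parse_nodetool_status(SAMPLE):
        print(node["address"], "up" if node["up"] else "DOWN", node["load_bytes"])
```

In practice the text would come from running nodetool status on a schedule (for example via subprocess.run) and the parsed rows would be appended to a log or metrics store with a timestamp, alongside network and disk metrics for the same interval.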