I am just restarting the Cassandra service, not the machine. I don't think I'm having any disk space issues, but operations show increased latency, and a restart fixes it. It seemed like the load reported by nodetool status might help in understanding what is going wrong, but I'm not sure. Another symptom is that nodes will report as DN in nodetool status and then come back up about a minute later.

I'm not really sure what to track to find out exactly what is going wrong on the cluster, so any insight or debugging techniques would be super helpful.


On May 31 2017, at 5:07 pm, Anthony Grasso <anthony.gra...@gmail.com> wrote:
Hi Daniel,

When you say that the nodes have to be restarted, are you just restarting the Cassandra service or are you restarting the machine?
How are you reclaiming disk space at the moment? Does disk space free up after the restart?

Regarding storage on nodes, keep in mind that the more data is stored on a node, the longer some operations to maintain that data will take to complete. In addition, the more data there is on each node, the longer it will take to stream data to other nodes. Whether you are replacing a down node or adding a new node, having a large amount of data on each node means it takes longer for a node to join the cluster if it is streaming the data.
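
To put a rough number on the streaming point above, here is a back-of-the-envelope sketch in Python. The function name and the 200 Mbit/s figure are assumptions for illustration only, not measurements from this cluster. It treats streaming as one sustained aggregate rate and ignores compaction, the number of source nodes, and network contention, so read the result as an order of magnitude at best.

```python
def estimate_stream_hours(data_tb: float, throughput_mbit_per_s: float) -> float:
    """Rough hours needed to stream data_tb terabytes at a sustained rate.

    Hypothetical helper: uses decimal units (1 TB = 1,000,000 MB) and
    assumes the given throughput is sustained for the whole transfer.
    """
    data_megabits = data_tb * 1_000_000 * 8  # TB -> MB -> megabits
    seconds = data_megabits / throughput_mbit_per_s
    return seconds / 3600


if __name__ == "__main__":
    # 200 Mbit/s is an assumed example rate, not a value taken from this thread.
    for tb in (1.0, 3.5):
        hours = estimate_stream_hours(tb, 200)
        print(f"{tb} TB at 200 Mbit/s: roughly {hours:.1f} hours")
```

Under these assumptions the estimate scales linearly with the data per node, so halving the data held on each node halves the estimated time for a replacement node to finish streaming.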

Kind regards,
Anthony

On 30 May 2017 at 02:43, Daniel Steuernol <dan...@sendwithus.com> wrote:
The cluster is running with RF=3, and right now each node is storing about 3-4 TB of data. I'm using r4.2xlarge EC2 instances, which have 8 vCPUs and 61 GB of RAM. The data drives are gp2 SSD EBS volumes with 10k IOPS. I guess this brings up the question: what's a good marker for deciding whether to increase disk space or provision a new node?



On May 29 2017, at 9:35 am, tommaso barbugli <tbarbu...@gmail.com> wrote:
Hi Daniel,

This is not normal. Possibly a capacity problem. What's the RF, how much data do you store per node, and what kind of servers do you use (core count, RAM, disk, ...)?

Cheers,
Tommaso

On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol <dan...@sendwithus.com> wrote:

I am running a 6-node cluster, and I have noticed that the reported load on each node rises throughout the week and grows way past the actual disk space used and available on each node. Eventually latency for operations suffers and the nodes have to be restarted. A couple of questions on this: is this normal? Also, does Cassandra need to be restarted every few days for best performance? Any insight on this behaviour would be helpful.

Cheers,
Daniel
--------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For additional commands, e-mail: user-h...@cassandra.apache.org
