Timestamp of Last Repair

2018-12-11 Thread Fred Habash
We are trying to detect a scenario where some of our smaller clusters go un-repaired for extended periods of time, mostly due to defects in deployment pipelines or human error. We would like to automate a check for clusters where nodes go un-repaired for more than 7 days, to shoot out an

Blocked NTR request

2018-12-11 Thread Guan Sun
Hi, I'm using Cassandra 2.2.8 with the default NTR queue configuration (max_queued_native_transport_requests = 128, native_transport_max_threads = 128), and from the metrics I'm seeing that some native transport requests are being blocked. I'm trying to understand what happens to the blocked native

Re: Cassandra single unreachable node causing total cluster outage

2018-12-11 Thread Agrawal, Pratik
Hello all, I’ve been doing more analysis and I have a few questions: 1. We observed that most of the requests are blocked on the NTR queue. I increased the queue size from 128 (default) to 1024 and this time the system does recover automatically (latencies go back to normal) without removing node

Re: 1.2.19: AssertionError when running compactions on a CF with TTLed columns

2018-12-11 Thread Reynald Borer
Hi everyone, I was finally able to sort out my problem in an "interesting" manner that I think is worth sharing on the list! What I did is the following: on each node, I stopped Cassandra, completely dropped the data files of the column family, started Cassandra again and issued a repair for