Re: Cassandra single unreachable node causing total cluster outage

2018-12-12 Thread cclive1601你
> Thanks, Pratik > From: "Agrawal, Pratik" > Date: Monday, December 3, 2018 at 11:55 PM > To: "user@cassandra.apache.org", Marc Selwan > Cc: Jeff Jirsa, Ben Slater <ben.sla...@instaclustr.com> > Subject

Re: Cassandra single unreachable node causing total cluster outage

2018-12-11 Thread Agrawal, Pratik
f Jirsa, Ben Slater Subject: Re: Cassandra single unreachable node causing total cluster outage Hello, 1. Cassandra latencies spiked to 5-6 times normal (both reads and writes). The latencies were in the high single-digit seconds. 2. As I said in my previous email, we don’t bound the N

Re: Cassandra single unreachable node causing total cluster outage

2018-12-03 Thread Agrawal, Pratik
te: Sunday, December 2, 2018 at 4:33 PM To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Jeff Jirsa <jji...@gmail.com>, Ben Slater <ben.sla...@instaclustr.com> Subject: Re: Cassandra single unreachabl

Re: Cassandra single unreachable node causing total cluster outage

2018-12-02 Thread Marc Selwan
grawal, Pratik" > Date: Sunday, December 2, 2018 at 4:33 PM > To: "user@cassandra.apache.org", Jeff Jirsa, Ben Slater > Subject: Re: Cassandra single unreachable node causing total cluster outage > > I looked into some

Re: Cassandra single unreachable node causing total cluster outage

2018-12-02 Thread Agrawal, Pratik
configuration). Thanks, Pratik From: Jeff Jirsa Reply-To: "user@cassandra.apache.org" Date: Tuesday, November 27, 2018 at 9:37 PM To: "user@cassandra.apache.org" Subject: Re: Cassandra single unreachable node causing total cluster outage Could also be the app not detecting the host

Re: Cassandra single unreachable node causing total cluster outage

2018-12-02 Thread Agrawal, Pratik
To: "user@cassandra.apache.org", Jeff Jirsa, Ben Slater Subject: Re: Cassandra single unreachable node causing total cluster outage I looked into some of the logs and saw that, at the time of the event, the Native requests started getting blocked, e.g. [INFO] org.apache.cassandra.utils.StatusLogg

Re: Cassandra single unreachable node causing total cluster outage

2018-11-27 Thread Jeff Jirsa
Could also be the app not detecting the host is down and it keeps trying to use it as a coordinator -- Jeff Jirsa > On Nov 27, 2018, at 6:33 PM, Ben Slater wrote: > > In what way does the cluster become unstable (ie more specifically what are > the symptoms)? My first thought would be the

Re: Cassandra single unreachable node causing total cluster outage

2018-11-27 Thread Ben Slater
In what way does the cluster become unstable (i.e., more specifically, what are the symptoms)? My first thought would be the loss of the node causing the other nodes to become overloaded, but that doesn’t seem to fit with your point 2. Cheers Ben --- *Ben Slater* *Chief Product Officer*

Cassandra single unreachable node causing total cluster outage

2018-11-27 Thread Agrawal, Pratik
Hello all, Setup: 18-node Cassandra cluster. Cassandra version 2.2.8. Amazon c3.2xlarge machines. Replication factor of 3 (in 3 different AZs). Reads and writes use QUORUM. Use case: 1. Short-lived data with heavy updates (I know we are abusing Cassandra here) with gc grace period of 15
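
For readers skimming the thread: the setup above (replication factor 3, QUORUM reads and writes) should in principle tolerate one unreachable replica, which is why a full outage is surprising. A minimal sketch of that arithmetic, assuming the standard definition of QUORUM as floor(RF / 2) + 1; the function names are illustrative only and are not part of Cassandra or any driver.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write.

    Assumes the standard definition: floor(RF / 2) + 1.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerable_replica_failures(replication_factor: int) -> int:
    """Replicas of a given partition that can be down while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    rf = 3  # replication factor stated in the original post
    print(f"RF={rf}: quorum={quorum(rf)}, "
          f"tolerable failures={tolerable_replica_failures(rf)}")
```

With RF 3 the quorum is 2 replicas, so losing one replica of any partition leaves QUORUM satisfiable. This only covers replica availability; it says nothing about coordinator selection or request queueing, which the replies in the thread raise as possible causes.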