Re: Aws instance stop and star with ebs

2019-11-29 Thread Georg Brandemann
Hi Rahul, also have a look at https://issues.apache.org/jira/browse/CASSANDRA-14358 . We saw this on a 2.1.x cluster, and there it also took ~10 minutes until the restarted node was fully available in the cluster. The echo ACKs from some nodes simply seemed never to reach the target. Georg

Re: Aws instance stop and star with ebs

2019-11-06 Thread Rahul Reddy
Thanks Daemeon, I will do that and post the results. I found an open JIRA with a similar issue: https://issues.apache.org/jira/browse/CASSANDRA-13984 On Wed, Nov 6, 2019 at 1:49 PM daemeon reiydelle wrote: > No connection timeouts? No tcp level retries? I am sorry truly sorry but > you have

Re: Aws instance stop and star with ebs

2019-11-06 Thread daemeon reiydelle
No connection timeouts? No TCP-level retries? I am truly sorry, but you have exceeded my capability. I have never seen a java.io timeout without either a session half-open failure (no response) or multiple retries. I am out of my depth, so please feel free to ignore, but did you see the

Re: Aws instance stop and star with ebs

2019-11-06 Thread Rahul Reddy
> Almost 15 minutes, that sounds suspiciously like blocking on a default TCP > socket timeout. > > > > *From: *Rahul Reddy > *Reply-To: *"user@cassandra.apache.org" > *Date: *Wednesday, November 6, 2019 at 12:12 PM > *To: *"user@cassandra.apache.org" > *Subject: *Re: Aw

Re: Aws instance stop and star with ebs

2019-11-06 Thread Reid Pinchback
Almost 15 minutes, that sounds suspiciously like blocking on a default TCP socket timeout. From: Rahul Reddy Reply-To: "user@cassandra.apache.org" Date: Wednesday, November 6, 2019 at 12:12 PM To: "user@cassandra.apache.org" Subject: Re: Aws instance stop and star wi
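
For context on why "almost 15 minutes" points at a TCP default: on Linux, an established connection that stops receiving ACKs is retransmitted with exponential backoff until the `net.ipv4.tcp_retries2` limit is reached. The sketch below adds up those waits under assumed kernel defaults (`tcp_retries2 = 15`, a minimum retransmission timeout of 200 ms, a maximum of 120 s). The function name and parameters are hypothetical, and nothing in this thread confirms that this is the timeout the cluster actually hit.

```python
def tcp_retransmit_timeout(retries=15, rto_min=0.2, rto_max=120.0):
    """Approximate seconds before Linux gives up on an unacknowledged,
    established TCP connection.

    Assumed defaults: tcp_retries2=15, RTO starts at 200 ms and doubles
    after each retransmission, capped at 120 s. Real connections start
    from an RTO derived from measured round-trip time, so this is only
    a lower-bound style estimate.
    """
    total = 0.0
    rto = rto_min
    # One wait before the first retransmission, plus one wait after each
    # of the `retries` retransmissions.
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)
    return total


if __name__ == "__main__":
    seconds = tcp_retransmit_timeout()
    print(f"{seconds:.1f} s (~{seconds / 60:.1f} min)")
```

With the assumed defaults the total comes to roughly 15 minutes, which is in the same range as the 16:32 to 16:47 window reported earlier in the thread.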

Re: Aws instance stop and star with ebs

2019-11-06 Thread Rahul Reddy
Thank you. I stopped an instance in east. I see that all other instances can gossip to that instance, and only one instance in west has issues gossiping to that node. When I enable debug mode, I see the messages below on the west node from 16:32 to 16:47: DEBUG [RMI TCP

Re: Aws instance stop and star with ebs

2019-11-05 Thread daemeon reiydelle
10 minutes is 600 seconds, and there are several timeouts that are set to that, including the data center timeout as I recall. You may be forced to tcpdump the interface(s) to see where the chatter is. Out of curiosity, when you restart the node, have you snapped the jvm's memory to see if e.g.

Re: Aws instance stop and star with ebs

2019-11-05 Thread Rahul Reddy
Thanks Ben. Before stopping the EC2 instance I ran nodetool drain, so I ruled that out, and system.log also doesn't show commit logs being applied. On Tue, Nov 5, 2019, 7:51 PM Ben Slater wrote: > The logs between first start and handshaking should give you a clue but my > first guess would be

Re: Aws instance stop and star with ebs

2019-11-05 Thread Ben Slater
The logs between first start and handshaking should give you a clue, but my first guess would be replaying commit logs. Cheers, Ben --- *Ben Slater*, *Chief Product Officer*

Re: Aws instance stop and star with ebs

2019-11-05 Thread Rahul Reddy
I can reproduce the issue. I drained the Cassandra node, then stopped and started the instance. The Cassandra instance comes up, but other nodes remain in DN state for around 10 minutes. I don't see any errors in system.log. DN xx.xx.xx.59 420.85 MiB 256 48.2% id 2 UN

Re: Aws instance stop and star with ebs

2019-10-30 Thread Rahul Reddy
Also, an AWS EC2 stop and start brings up a new instance with the same IP, and all our file systems are on EBS and mounted fine. Does coming up as a new instance with the same IP cause any gossip issues? On Tue, Oct 29, 2019, 6:16 PM Rahul Reddy wrote: > Thanks Alex. We have 6 nodes in each DC with RF=3 with CL

Re: Aws instance stop and star with ebs

2019-10-29 Thread Rahul Reddy
Thanks Alex. We have 6 nodes in each DC with RF=3 and CL LOCAL_QUORUM, and we stopped and started only one instance at a time. Though nodetool status says all nodes are UN and system.log says Cassandra started and began listening, the JMX exporter shows the instance stayed down longer. How do we

Re: Aws instance stop and star with ebs

2019-10-29 Thread Oleksandr Shulgin
On Tue, Oct 29, 2019 at 9:34 PM Rahul Reddy wrote: > > We have our infrastructure on AWS and we use EBS storage. AWS was > retiring one of the nodes. Since our storage was persistent, we ran nodetool > drain, then stopped and started the instance. This caused 500 errors in the > service. We have

Aws instance stop and star with ebs

2019-10-29 Thread Rahul Reddy
Hello, We have our infrastructure on AWS and we use EBS storage. AWS was retiring one of the nodes. Since our storage was persistent, we ran nodetool drain, then stopped and started the instance. This caused 500 errors in the service. We have LOCAL_QUORUM and RF=3; why does stopping one instance
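
The expectation behind this question can be checked with a little arithmetic. Cassandra computes a quorum as the replication factor divided by two (rounded down) plus one, and LOCAL_QUORUM applies that to the replication factor of the local data center. The helper names below are hypothetical and only illustrate the calculation; they say nothing about why the restarted node was seen as down for so long.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM / LOCAL_QUORUM request."""
    return replication_factor // 2 + 1


def tolerable_down_replicas(replication_factor: int) -> int:
    """Replicas that can be unavailable while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    rf = 3  # replication factor per data center, as stated in the thread
    print(f"RF={rf}: quorum={quorum(rf)}, "
          f"can lose {tolerable_down_replicas(rf)} replica(s)")
```

With RF=3 a quorum is 2 replicas, so a single stopped node should not by itself make LOCAL_QUORUM requests fail, which is why the 500 errors were unexpected.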