Ananthkumar K S created CASSANDRA-6772:
------------------------------------------
Summary: Cassandra inter data center communication broken
Key: CASSANDRA-6772
URL: https://issues.apache.org/jira/browse/CASSANDRA-6772
Project: Cassandra
Issue Type: Bug
Environment: CentOS 6.0
Reporter: Ananthkumar K S
Priority: Blocker
I have two data centers, DC1 and DC2. Both communicate via a private link.
Yesterday, we had a problem with the private link for 10 minutes. Since the
problem was resolved, nodes in the two data centers have not been able to
communicate with each other. When I run nodetool status on a node in DC1, the
nodes in DC2 are reported as down. When I run it in DC2, the nodes in DC1 are
shown as down.
In the Cassandra logs, we can clearly see that handshaking is failing every
5 seconds for communication between data centers. At the TCP level, Cassandra
is generating a large number of connections in FIN_WAIT1, which is still a
puzzle. The number of CLOSE_WAIT TCP transitions caused by this is very high.
We had already moved from 2.0.1 to 2.0.3 because of this kind of TCP listen
drop problem. In 2.0.1, it occurred within a single data center, but here it
is between data centers. In case it has anything to do with the snitch
configuration, I am using GossipingPropertyFileSnitch.
This clearly started happening after the private link failure. Any idea on this?
Cassandra version used is 2.0.3
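
For anyone trying to reproduce the TCP observation above, here is a minimal sketch that tallies connection states from `netstat -ant` output. It assumes the Linux netstat column layout and that inter-node traffic uses Cassandra's default storage_port of 7000; the function name count_tcp_states is only illustrative and is not part of Cassandra or nodetool.

```python
import sys
from collections import Counter


def count_tcp_states(netstat_output, port=7000):
    """Count TCP states for connections whose local or remote port matches.

    Assumes `netstat -ant` style lines on Linux, for example:
        tcp  0  0 10.0.0.1:7000  10.1.0.1:51234  ESTABLISHED
    The default port 7000 is an assumption (Cassandra's default storage_port).
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banner and header lines and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        # The port is whatever follows the last ':' in each address.
        ports = {local.rsplit(":", 1)[-1], remote.rsplit(":", 1)[-1]}
        if str(port) in ports:
            counts[state] += 1
    return counts


if __name__ == "__main__":
    # Usage: netstat -ant | python count_tcp_states.py
    for state, n in count_tcp_states(sys.stdin.read()).most_common():
        print(state, n)
```

Running this on a node in each data center a few times should show whether the FIN_WAIT1 and CLOSE_WAIT counts on the storage port keep growing after the link is restored.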