[ https://issues.apache.org/jira/browse/CASSANDRA-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13913116#comment-13913116 ]

Ananthkumar K S commented on CASSANDRA-6772:
--------------------------------------------

The firewall is perfectly fine. We did a detailed analysis of the TCP dump, and
it showed active connection retries. So it looks like there is a TCP listen drop
at the application layer. I had reported a similar bug previously on
Stack Overflow for 2.0.1 but was told to upgrade to 2.0.3. In the case of 2.0.1,
there were no retries. But now they have put in an infinite loop to detect the
servers. This causes the MessagingService to run continuously until the TCP
handshake completes.
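
For anyone who wants to check the socket states without a packet capture, here is a minimal sketch that tallies TCP states for connections touching the inter-node port. It assumes Linux (the /proc/net/tcp text format, IPv4 only) and the default storage_port of 7000; the function name count_states is hypothetical and not part of Cassandra or nodetool.

```python
from collections import Counter

# TCP state codes as they appear (in hex) in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_states(proc_net_tcp_text, port=7000):
    """Count TCP states for sockets whose local or remote port equals `port`.

    `proc_net_tcp_text` is the full text of /proc/net/tcp (header included).
    The default port of 7000 is an assumption: Cassandra's default storage_port.
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # Addresses look like "0100007F:1B58"; the part after ':' is the hex port.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        if port in (local_port, remote_port):
            state = TCP_STATES.get(fields[3].upper(), fields[3])
            counts[state] += 1
    return counts
```

On a node, this could be fed with open("/proc/net/tcp").read() and run a few times in a row; a steadily growing FIN_WAIT1 or CLOSE_WAIT count on port 7000 would match what the dump showed.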

Previous error:
https://issues.apache.org/jira/browse/CASSANDRA-6349


> Cassandra inter data center communication broken
> ------------------------------------------------
>
>                 Key: CASSANDRA-6772
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6772
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: CentOS 6.0
>            Reporter: Ananthkumar K S
>            Priority: Blocker
>
> I have two data centers, DC1 and DC2. Both communicate via a private link.
> Yesterday, we had a problem with the private link for 10 minutes. Since
> the problem was resolved, nodes in the two data centers have not been able to
> communicate with each other. When I run nodetool status on a node in DC1,
> the nodes in DC2 are reported as down. When I run it in DC2, nodes in DC1 are
> shown as down.
> But in the Cassandra logs, we can clearly see that handshaking is failing
> every 5 seconds for communication between data centers. At the TCP level,
> Cassandra generates too many FIN_WAIT1 connections, which is still a puzzle.
> The number of CLOSE_WAIT TCP transitions due to this is very high. Because of
> this kind of TCP listen drop problem, we moved from 2.0.1 to 2.0.3. In 2.0.1,
> it happened within a data center itself. But here it is between data centers.
> In case it has anything to do with the snitch configuration, I am using
> GossipingPropertyFileSnitch.
> This clearly started happening after the private link failure. Any idea on this?
> The Cassandra version used is 2.0.3.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
