[ https://issues.apache.org/jira/browse/CASSANDRA-9630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paulo Motta updated CASSANDRA-9630:
-----------------------------------
    Description:
After upgrading from Cassandra from 2.0.12 to 2.0.15, whenever we killed a cassandra process (with SIGTERM), some other nodes maintained a connection with the killed node in the CLOSE_WAIT state on port 7000 for about 5-20 minutes.
So, when we started the killed node again, other nodes could not establish a handshake because of the connections on the CLOSE_WAIT state, so they remained on the DOWN state to each other until the initial connection expired.
The problem did not happen if I ran a nodetool disablegossip before killing the node.
I was able to fix this issue by reverting the CASSANDRA-8336 commits (including CASSANDRA-9238). After reverting this, cassandra now closes connection correctly when killed with -TERM, but leaves connections on CLOSE_WAIT state if I run nodetool disablethrift before killing the nodes.
I did not try to reproduce the problem in a clean environment.

  was:
After upgrading from Cassandra from 2.0.12 to 2.0.15, whenever we killed a cassandra process (with SIGTERM), some other nodes maintained a connection with the killed node in the CLOSE_WAIT state on port 7000 for about 5-20 minutes.
So, when we started the killed node again, other nodes could not establish a handshake because of the connections on the CLOSE_WAIT state, so they remained on the DOWN state to each other until the initial connection expired.
The problem did not happen if I ran a nodetool disablegossip before killing the node.
I was able to fix this issue by reverting the CASSANDRA-8336 commits (including CASSANDRA-9238). After reverting this, cassandra now closes conenction correctly when killed with -TERM, but leaves connections on CLOSE_WAIT state if I run nodetool disablethrift before killing the nodes.
I did not try to reproduce the problem in a clean environment.

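
To check for the symptom described here on a surviving node, something like the following sketch might help. It is not part of the report. It assumes a Linux host, where /proc/net/tcp lists IPv4 sockets with hexadecimal addresses and a hexadecimal state code (08 is CLOSE_WAIT), and it assumes the storage port is 7000 as stated in the description. The function name close_wait_connections is hypothetical.

```python
from pathlib import Path

CLOSE_WAIT = "08"    # state code the Linux kernel reports in /proc/net/tcp
STORAGE_PORT = 7000  # inter-node port named in the report (assumed unchanged)


def close_wait_connections(proc_net_tcp_text, port=STORAGE_PORT):
    """Return (local, remote) hex address pairs in CLOSE_WAIT that involve `port`."""
    matches = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        # Addresses look like "0100007F:1B58"; the part after ":" is the hex port.
        ports = {int(addr.rsplit(":", 1)[1], 16) for addr in (local, remote)}
        if state == CLOSE_WAIT and port in ports:
            matches.append((local, remote))
    return matches


if __name__ == "__main__":
    path = Path("/proc/net/tcp")  # IPv4 only; /proc/net/tcp6 uses the same layout
    if path.exists():
        for local, remote in close_wait_connections(path.read_text()):
            print(local, remote)
```

The port is matched on either end of the socket, since a lingering connection may have been opened by the surviving node or by the killed one.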
> Killing cassandra process results in unclosed connections
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9630
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9630
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Paulo Motta
>            Assignee: Brandon Williams
>
> After upgrading from Cassandra from 2.0.12 to 2.0.15, whenever we killed a
> cassandra process (with SIGTERM), some other nodes maintained a connection
> with the killed node in the CLOSE_WAIT state on port 7000 for about 5-20
> minutes.
> So, when we started the killed node again, other nodes could not establish a
> handshake because of the connections on the CLOSE_WAIT state, so they
> remained on the DOWN state to each other until the initial connection expired.
> The problem did not happen if I ran a nodetool disablegossip before killing
> the node.
> I was able to fix this issue by reverting the CASSANDRA-8336 commits
> (including CASSANDRA-9238). After reverting this, cassandra now closes
> connection correctly when killed with -TERM, but leaves connections on
> CLOSE_WAIT state if I run nodetool disablethrift before killing the nodes.
> I did not try to reproduce the problem in a clean environment.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)