[
https://issues.apache.org/jira/browse/CASSANDRA-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sergio Bossa updated CASSANDRA-9238:
------------------------------------
Attachment: 2.0-CASSANDRA-9238-v2.txt
[~brandon.williams], I've attached a v2 patch addressing the problem in a
different way: by properly closing all established connections when
the {{MessagingService}} shuts down, so that the sender node ends up creating
new connections once the shut-down node starts listening again.
This seems to fix my original problem: I verified it with the previous
(now committed) patch commented out, so you might eventually want to revert
that patch.
Also, this _might_ fix CASSANDRA-8072 as well (I verified via netstat that all
connections are actually closed), but I only glanced through your last comments
there, so I might be wrong.
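For readers following along, the idea behind the v2 patch (close every established connection on shutdown, so the sender has to open a fresh one later) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the patch itself and not Cassandra's API: {{Link}}, {{ReceivingNode}}, {{SendingNode}} and all their methods are hypothetical names invented for illustration, and a shared open/closed flag stands in for a real TCP connection.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one connection as seen by both endpoints.
// A real implementation would wrap a socket; this only tracks open/closed.
class Link {
    private volatile boolean open = true;
    boolean isOpen() { return open; }
    void close() { open = false; }
}

// Hypothetical stand-in for the node that is shut down and later restarted.
class ReceivingNode {
    private final List<Link> established = new ArrayList<>();
    private boolean listening = true;

    // Accept a new inbound connection and remember it so shutdown can close it.
    synchronized Link accept() {
        if (!listening) throw new IllegalStateException("not listening");
        Link link = new Link();
        established.add(link);
        return link;
    }

    // Stop listening AND close every established connection,
    // instead of leaving old connections half-alive.
    synchronized void shutdown() {
        listening = false;
        for (Link link : established) link.close();
        established.clear();
    }

    synchronized void start() { listening = true; }
    synchronized int establishedCount() { return established.size(); }
}

// Hypothetical sender that caches its outbound connection to a single peer.
class SendingNode {
    private Link cached;
    private int connectionsCreated = 0;

    // Reuse the cached connection while it is open; otherwise create a new one.
    synchronized Link connectionTo(ReceivingNode peer) {
        if (cached == null || !cached.isOpen()) {
            cached = peer.accept();
            connectionsCreated++;
        }
        return cached;
    }

    synchronized int connectionsCreated() { return connectionsCreated; }
}

class ShutdownSketch {
    public static void main(String[] args) {
        ReceivingNode peer = new ReceivingNode();
        SendingNode sender = new SendingNode();

        Link first = sender.connectionTo(peer);
        peer.shutdown();   // closes the established connection too
        peer.start();      // node comes back and listens again
        Link second = sender.connectionTo(peer);

        System.out.println("first still open: " + first.isOpen());
        System.out.println("reconnected with a new link: " + (first != second));
    }
}
```

The point of the model is that the sender never needs to be told explicitly to reconnect: because the shut-down side closed the connection, the cached link is observably dead and the next send creates a new one.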
> Race condition after shutdown gossip message
> --------------------------------------------
>
> Key: CASSANDRA-9238
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9238
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sergio Bossa
> Assignee: Sergio Bossa
> Priority: Minor
> Fix For: 2.0.15, 2.1.6
>
> Attachments: 2.0-CASSANDRA-9238-v2.txt, 2.0-CASSANDRA-9238.txt
>
>
> CASSANDRA-8336 introduced a race condition that causes gossip messages to be
> sent to shut-down nodes even after they have been marked dead.
> That's because CASSANDRA-8336 changed (among other things) the way the
> SHUTDOWN gossip message is sent, by moving it before the gossip task (the one
> sending SYN messages) and putting a wait of a few seconds between the two.
> This opens a race window on the receiving side between the time the SHUTDOWN
> message is received, which causes the outbound sockets to be closed, and the
> moment the other side's listening socket is actually closed: any SYN gossip
> message exchanged in that window will reopen the sockets, and they will never
> be closed again because the node is already marked dead.