[ 
https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-5669:
-----------------------------------

    Description: 
While debugging the upgrade scenario described in CASSANDRA-5660, I 
discovered that ITC.close() resets the message protocol version of a peer 
node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
path, but basically the Ec2MultiRegionSnitch closes connections on the 
public IP address to reconnect on the private IP, and this causes ITC to drop 
the message protocol version of previously known nodes. I think we want to 
hang onto that version so that when the newer node (re-)connects to the node 
running the lower version, it passes the correct protocol version. Otherwise 
it passes the current version (too high for the older node), the connection 
attempt gets dropped, and we go through the dance again.

To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
Anecdotally, perhaps one connection per second gets turned over.

    
> ITC.close() resets peer msg version, causes connection thrashing in ec2 
> during upgrade
> --------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5669
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.5
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: gossip
>             Fix For: 1.2.6, 2.0 beta 1
>
>         Attachments: 5669-v1.diff
>
>
> While debugging the upgrade scenario described in CASSANDRA-5660, I 
> discovered that ITC.close() resets the message protocol version of a peer 
> node that disconnects. CASSANDRA-5660 has a full description of the upgrade 
> path, but basically the Ec2MultiRegionSnitch closes connections on the 
> public IP address to reconnect on the private IP, and this causes ITC to drop 
> the message protocol version of previously known nodes. I think we want to 
> hang onto that version so that when the newer node (re-)connects to the node 
> running the lower version, it passes the correct protocol version. Otherwise 
> it passes the current version (too high for the older node), the connection 
> attempt gets dropped, and we go through the dance again.
>
> To clarify, the 'thrashing' is at a rather low volume, from what I observed. 
> Anecdotally, perhaps one connection per second gets turned over.
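
For illustration only, here is a minimal sketch of the behaviour described 
above. It is not the attached patch and does not use Cassandra's real classes: 
PeerVersionSketch, its methods, and the version numbers are all hypothetical. 
It only contrasts a close() that forgets the peer's message version with one 
that keeps it, so that a reconnect to an older node starts at a version that 
node understands.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a per-peer message version table.
class PeerVersionSketch {
    // Assumed protocol version of the local (newer) node; the value is made up.
    static final int CURRENT_VERSION = 6;

    private final Map<String, Integer> versions = new ConcurrentHashMap<>();

    // Remember the version a peer announced on an earlier connection.
    void recordVersion(String peer, int version) {
        versions.put(peer, version);
    }

    // Version to use when opening a connection to the peer.
    // Unknown peers are assumed to speak the local version.
    int versionFor(String peer) {
        return versions.getOrDefault(peer, CURRENT_VERSION);
    }

    // Problematic behaviour: closing the connection also drops the version.
    void closeForgetting(String peer) {
        versions.remove(peer);
    }

    // Proposed behaviour: closing releases the connection only.
    // The recorded version is deliberately left in place.
    void closeRemembering(String peer) {
        // nothing to remove; the version table is untouched
    }

    public static void main(String[] args) {
        PeerVersionSketch sketch = new PeerVersionSketch();
        String olderNode = "10.0.0.1"; // made-up address
        int olderVersion = 5;          // made-up lower version

        sketch.recordVersion(olderNode, olderVersion);
        sketch.closeForgetting(olderNode);
        System.out.println("after forgetting close: " + sketch.versionFor(olderNode));

        sketch.recordVersion(olderNode, olderVersion);
        sketch.closeRemembering(olderNode);
        System.out.println("after remembering close: " + sketch.versionFor(olderNode));
    }
}
```

In the first case the reconnect falls back to the local version, which the 
older node would reject, and the cycle repeats. In the second case the 
reconnect uses the version the older node announced earlier.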

