[ 
https://issues.apache.org/jira/browse/CASSANDRA-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102254#comment-13102254
 ] 

Marcus Eriksson commented on CASSANDRA-3166:
--------------------------------------------

Minimal patch attached.

Clear the version in IncomingTcpConnection instead, since that is the class 
that sets it. Before, we could end up in a state where the outgoing 
connections had been closed but the incoming one was still up, meaning the 
version had been reset and there was no way for it to be set again.

Now it is IncomingTcpConnection's responsibility to keep track of versions; 
if that connection is closed, we are bound to get a new incoming connection 
and therefore set the version correctly.
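
To illustrate the idea (this is not the attached patch), here is a minimal 
sketch of tying the version's lifetime to the incoming connection. All class 
and method names below (VersionRegistry, IncomingConnection, 
OutgoingConnection) are hypothetical stand-ins, not Cassandra's actual code, 
and the version number used is arbitrary.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: these names are illustrative, not Cassandra's classes.
class VersionTrackingSketch {

    /** Maps a peer address to the protocol version it last announced. */
    static class VersionRegistry {
        private final Map<String, Integer> versions = new ConcurrentHashMap<>();

        void setVersion(String peer, int version) { versions.put(peer, version); }

        void clearVersion(String peer) { versions.remove(peer); }

        /** Returns null when no version is known for the peer. */
        Integer getVersion(String peer) { return versions.get(peer); }
    }

    /** The incoming side sets the version, so it is also the side that clears it. */
    static class IncomingConnection {
        private final VersionRegistry registry;
        private final String peer;

        IncomingConnection(VersionRegistry registry, String peer, int announcedVersion) {
            this.registry = registry;
            this.peer = peer;
            registry.setVersion(peer, announcedVersion);
        }

        void close() { registry.clearVersion(peer); }
    }

    /** Closing an outgoing connection deliberately leaves the version alone. */
    static class OutgoingConnection {
        void close() { /* no version bookkeeping here */ }
    }

    public static void main(String[] args) {
        VersionRegistry registry = new VersionRegistry();
        String peer = "10.0.0.1"; // made-up address

        IncomingConnection in = new IncomingConnection(registry, peer, 2);
        new OutgoingConnection().close();
        System.out.println("after outgoing close: " + registry.getVersion(peer));

        in.close();
        System.out.println("after incoming close: " + registry.getVersion(peer));

        // A reconnecting peer announces its version again.
        new IncomingConnection(registry, peer, 2);
        System.out.println("after reconnect: " + registry.getVersion(peer));
    }
}
```

With this arrangement the version can only be missing while no incoming 
connection exists, and the next incoming connection sets it again.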



> Rolling upgrades from 0.7 to 0.8 not possible
> ---------------------------------------------
>
>                 Key: CASSANDRA-3166
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3166
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.7.5, 0.7.9, 0.8.4
>            Reporter: Marcus Eriksson
>             Fix For: 0.8.4
>
>
> We are in the process of upgrading to 0.8 and need to do a rolling 
> upgrade. This fails miserably, and it is reproducible:
> 1. set up a 3 node cluster with 0.7.9 and rf=3, read and write, QUORUM
> 2. upgrade one of the nodes (I upgraded a seed node, not sure if that is 
> important)
> 3. continue reading/writing
> 4. see logs on the 0.7 node fill up with: INFO 12:36:08,240 Received 
> connection from newer protocol version. Ignorning message.
> It does work if I start the 0.7.9 nodes *after* the 0.8.4 node, which makes 
> me think that it matters whether the 0.8 node connects to the 0.7 nodes or 
> the other way round.
> Debug logging on the 0.8 node shows:
> /var/log/cassandra/system.log.9:DEBUG [pool-2-thread-82] 2011-09-09 
> 11:55:06,067 StorageProxy.java (line 178) Write timeout 
> java.util.concurrent.TimeoutException for one (or more) of: 
> /var/log/cassandra/system.log.9:DEBUG [pool-2-thread-76] 2011-09-09 
> 11:55:06,067 StorageProxy.java (line 584) Read timeout: 
> java.util.concurrent.TimeoutException: Operation timed out - received only 1 
> responses from /193.182.3.92,  .
> There is nothing except the "newer protocol version..." message in the 
> 0.7 logs.
> I will continue to look at this issue, but if anyone has a quick patch, let 
> me know.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
