[
https://issues.apache.org/jira/browse/CASSANDRA-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101259#comment-13101259
]
Peter Schuller commented on CASSANDRA-3166:
-------------------------------------------
Removing the resetVersion() did not help. I added some logging to
IncomingTcpConnection and it seems that when the 0.8 node goes up first, the
0.7 node never tries to make an outgoing connection to it.
If my understanding is correct, from reading CASSANDRA-2818 and looking at the
code, I think the intent is that we discover the version of the other guy
whenever that guy connects to *us*; we can never find out that the other side
has a mismatched version based on activity on the outbound connection.
So, incoming connections would be a necessity in order for the 0.8 node to ever
adjust its lingo.
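
To make that reasoning concrete, here is a minimal sketch of the behavior as I understand it. This is not the actual MessagingService/IncomingTcpConnection code: the class name, method names, peer address and the version numbers (1 for the older node, 2 for the newer one) are all made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; names and version numbers are assumptions,
// not Cassandra's real API.
class PeerVersions {
    private final int ownVersion;
    // Versions we have learned, keyed by peer address.
    private final Map<String, Integer> known = new ConcurrentHashMap<>();

    PeerVersions(int ownVersion) {
        this.ownVersion = ownVersion;
    }

    // The only place a peer's version is learned: the peer connected to us
    // and announced its version in the connection header.
    void onIncomingConnection(String peer, int peerVersion) {
        known.put(peer, peerVersion);
    }

    // Version used when writing to a peer. With nothing learned yet we
    // speak our own version; otherwise we never exceed what the peer announced.
    int outboundVersionFor(String peer) {
        Integer v = known.get(peer);
        return v == null ? ownVersion : Math.min(ownVersion, v);
    }
}
```

In this model a newer node keeps writing at its own version to an older peer until that peer opens a connection to it. That would match what I see when the 0.8 node comes up first and the 0.7 node never connects.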
> Rolling upgrades from 0.7 to 0.8 not possible
> ---------------------------------------------
>
> Key: CASSANDRA-3166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3166
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 0.7.5, 0.7.9, 0.8.4
> Reporter: Marcus Eriksson
>
> We are in the process of upgrading to 0.8 and we need to do a rolling
> upgrade. This fails miserably, and it is reproducible:
> 1. Set up a 3-node cluster with 0.7.9 and rf=3; read and write at QUORUM
> 2. Upgrade one of the nodes (I upgraded a seed node, not sure if that is
> important)
> 3. Continue reading/writing
> 4. see logs on the 0.7 node fill up with: INFO 12:36:08,240 Received
> connection from newer protocol version. Ignorning message.
> It does work if I start the 0.7.9 nodes *after* the 0.8.4 node, which makes me
> think that it matters whether the 0.8 node connects to the 0.7 nodes or
> the other way round.
> Debug logging on the 0.8 node shows:
> /var/log/cassandra/system.log.9:DEBUG [pool-2-thread-82] 2011-09-09
> 11:55:06,067 StorageProxy.java (line 178) Write timeout
> java.util.concurrent.TimeoutException for one (or more) of:
> /var/log/cassandra/system.log.9:DEBUG [pool-2-thread-76] 2011-09-09
> 11:55:06,067 StorageProxy.java (line 584) Read timeout:
> java.util.concurrent.TimeoutException: Operation timed out - received only 1
> responses from /193.182.3.92, .
> Nothing except for the "newer protocol version..." message appears in the
> 0.7 logs.
> I will continue to look at this issue, but if anyone has a quick patch, let me
> know.