[
https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049326#comment-13049326
]
Brandon Williams commented on CASSANDRA-2768:
---------------------------------------------
bq. Since Sasha has reportedly verified that all nodes report being on 0.8.0,
this suggests a Gossiper bug that reports the wrong version (even after node
restarts).
Gossiper's setVersion and getVersion are fairly straightforward. setVersion is
called in IncomingTcpConnection, which sets the version to whatever the remote
node announces, so a bug here looks unlikely. The version information is not
persisted anywhere, so the remote node has to be indicating it is 0.7. I was
unable to reproduce this following Sasha's steps, so I think the most likely
explanation is that a node is mistakenly still on 0.7.
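To illustrate why a restart rules out a stale entry, here is a minimal sketch
of an in-memory version table. It is not the actual Gossiper code: the class
name, the String endpoint keys, and the two version constants are all
assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class VersionRegistrySketch {
    // Illustrative protocol version numbers; the real values are an assumption here.
    public static final int VERSION_07 = 1;
    public static final int VERSION_080 = 2;

    // In-memory only: nothing is written to disk, so the table is empty after a restart.
    private final Map<String, Integer> versions = new ConcurrentHashMap<>();

    // Assumed call site: the handler for an incoming connection records
    // whatever version the remote node announced.
    public void setVersion(String endpoint, int version) {
        versions.put(endpoint, version);
    }

    // Returns the last version announced by the endpoint, or null if it has
    // not connected since this process started.
    public Integer getVersion(String endpoint) {
        return versions.get(endpoint);
    }

    public static void main(String[] args) {
        VersionRegistrySketch beforeRestart = new VersionRegistrySketch();
        beforeRestart.setVersion("10.128.34.18", VERSION_07);
        System.out.println("before restart: " + beforeRestart.getVersion("10.128.34.18"));

        // A restart is modelled as a fresh object: the old entry is gone.
        VersionRegistrySketch afterRestart = new VersionRegistrySketch();
        System.out.println("after restart: " + afterRestart.getVersion("10.128.34.18"));
    }
}
```

Under these assumptions, a 0.7 entry that reappears after the local node
restarts can only come from the remote node announcing 0.7 again.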
{quote}
I am seeing this error on two of the nodes:
ERROR [pool-2-thread-14] 2011-06-14 23:33:40,544 CustomTThreadPoolServer.java
(line 199) Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in
readMessageBegin, old client?
{quote}
This is indicative of a client-side thrift compatibility problem and is
unrelated, as thrift is not used for internode communication.
> AntiEntropyService excluding nodes that are on version 0.7 or sooner
> --------------------------------------------------------------------
>
> Key: CASSANDRA-2768
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2768
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8.0
> Environment: 4 node environment --
> Originally 0.7.6-2 with a Keyspace defined with RF=3
> Upgraded all nodes ( 1 at a time ) to version 0.8.0: For each node, the node
> was shut down, new version was turned on, using the existing data files /
> directories and a nodetool repair was run.
> Reporter: Sasha Dolgy
> Assignee: Brandon Williams
>
> When I run nodetool repair on any of the nodes, the
> /var/log/cassandra/system.log reports errors similar to:
> INFO [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13
> 21:28:39,877 AntiEntropyService.java (line 177) Excluding /10.128.34.18 from
> repair because it is on version 0.7 or sooner. You should consider updating
> this node before running repair again.
> ERROR [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13
> 21:28:39,877 AbstractCassandraDaemon.java (line 113) Fatal exception in
> thread Thread[manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec,5,RMI
> Runtime]
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
> at java.util.HashMap$KeyIterator.next(HashMap.java:828)
> at org.apache.cassandra.service.AntiEntropyService.getNeighbors(AntiEntropyService.java:173)
> at org.apache.cassandra.service.AntiEntropyService$RepairSession.run(AntiEntropyService.java:776)
> The INFO message and subsequent ERROR message are logged for 2 nodes .. I
> suspect that this is because RF=3.
> nodetool ring shows that all nodes are up.
> Client connections (read / write) are not having issues..
> nodetool version on all nodes shows that each node is 0.8.0
> At suggestion of some contributors, I have restarted each node and tried to
> run a nodetool repair again ... the result is the same with the messages
> being logged.
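The stack trace above (HashMap$KeyIterator.next throwing right after a single
"Excluding" message) is consistent with an element being removed from a hash
set while a for-each loop is iterating over it. Whether getNeighbors does
exactly this is an assumption; the sketch below only reproduces the general
pattern and one common way to avoid it. All class and method names are
hypothetical.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class NeighborFilterSketch {
    // Unsafe pattern: removes from the set that the for-each loop is iterating.
    // If another element remains after the first removal, the iterator's next()
    // call throws ConcurrentModificationException.
    public static Set<String> filterUnsafe(Set<String> neighbors, Set<String> oldNodes) {
        Set<String> result = new HashSet<>(neighbors);
        for (String endpoint : result) {
            if (oldNodes.contains(endpoint)) {
                result.remove(endpoint); // structural modification during iteration
            }
        }
        return result;
    }

    // Safe pattern: remove through the iterator itself.
    public static Set<String> filterSafe(Set<String> neighbors, Set<String> oldNodes) {
        Set<String> result = new HashSet<>(neighbors);
        Iterator<String> it = result.iterator();
        while (it.hasNext()) {
            if (oldNodes.contains(it.next())) {
                it.remove();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two neighbors, both believed to be on the old version (as with RF=3).
        Set<String> neighbors = new HashSet<>(Arrays.asList("10.0.0.1", "10.0.0.2"));
        try {
            filterUnsafe(neighbors, neighbors);
        } catch (ConcurrentModificationException e) {
            System.out.println("unsafe removal failed: " + e);
        }
        System.out.println("safe removal left: " + filterSafe(neighbors, neighbors));
    }
}
```

With two neighbors that are both flagged as old, the unsafe variant removes
the first one and then fails on the next call to the iterator, which would
match one INFO line followed by the exception.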