[
https://issues.apache.org/jira/browse/CASSANDRA-10969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083749#comment-15083749
]
Joel Knighton commented on CASSANDRA-10969:
-------------------------------------------
Your observations on this ticket and your comment on [CASSANDRA-8113] are
correct; we should handle the possibility of a legitimately long-running
cluster properly. In my opinion, the current behavior is a bug, and I'll work
on a fix.
You are also correct that a rolling restart should fix this, because a
generation of 0 (as after a restart) is special-cased in the check introduced
for [CASSANDRA-8113].
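As a rough illustration of the check under discussion, here is a minimal sketch using the generation values from the report quoted below. It is not the actual Gossiper.java code: the class, method, and constant names are hypothetical, and it assumes generations are in seconds and that the 0 special case applies to the locally held generation.

```java
// Hypothetical sketch of the generation sanity check; NOT the actual Gossiper.java code.
final class GenerationCheck {
    // One-year threshold in seconds, as described in the ticket (constant name is an assumption).
    static final long MAX_GENERATION_DIFFERENCE = 86400L * 365;

    // Returns true when the received generation would be rejected as invalid.
    // Assumption: a locally held generation of 0 is special-cased and never rejects.
    static boolean isInvalidGeneration(long localGeneration, long receivedGeneration) {
        return localGeneration != 0
            && receivedGeneration > localGeneration + MAX_GENERATION_DIFFERENCE;
    }
}
```

With the reported values (local 1414613355, received 1450978722) the gap is 36,365,367 seconds, roughly 421 days, which exceeds the 31,536,000-second threshold, so the sketch rejects the update. With a local generation of 0 the same received value is accepted, which is consistent with a rolling restart clearing the error.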
> long-running cluster sees bad gossip generation when a node restarts
> --------------------------------------------------------------------
>
> Key: CASSANDRA-10969
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10969
> Project: Cassandra
> Issue Type: Bug
> Components: Coordination
> Environment: 4-node Cassandra 2.1.1 cluster, each node running on a
> Linux 2.6.32-431.20.3.dl6.x86_64 VM
> Reporter: T. David Hudson
> Assignee: Joel Knighton
> Priority: Minor
>
> One of the nodes in a long-running Cassandra 2.1.1 cluster (not under my
> control) restarted. The remaining nodes are logging errors like this:
> "received an invalid gossip generation for peer xxx.xxx.xxx.xxx; local
> generation = 1414613355, received generation = 1450978722"
> The gap between the local and received generation numbers exceeds the
> one-year threshold added for CASSANDRA-8113. The system clocks are
> up-to-date for all nodes.
> If this is a bug, the latest released Gossiper.java code in 2.1.x, 2.2.x, and
> 3.0.x seems not to have changed the behavior that I'm seeing.
> I presume that restarting the remaining nodes will clear up the problem,
> whence the minor priority.