[
https://issues.apache.org/jira/browse/CASSANDRA-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216632#comment-13216632
]
Peter Schuller commented on CASSANDRA-3960:
-------------------------------------------
(Patch will be coming.)
> reporting "new" nodes to FD in gossiper is incorrect for bootstrapping nodes
> ----------------------------------------------------------------------------
>
> Key: CASSANDRA-3960
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3960
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Peter Schuller
> Assignee: Peter Schuller
> Priority: Minor
>
> The fix committed for CASSANDRA-3626 is incorrect for bootstrapping nodes.
> The saved endpoint states include only nodes that have fully joined the ring,
> so when a node is restarted, every node in the joining state is "new" from
> the perspective of that node. The FD report() call in the Gossiper then bumps
> each such node into UP. This can be harmful because it causes requests to be
> queued up to the node, which is potentially significant (e.g. GC pressure due
> to promotion into old-gen).
> In the case I saw in production, the node was *in fact* up, so it was okay
> (but it later got kicked down due to computational complexity issues and the
> gossip stage being backed up on start-up, which is how I realized this could
> be a problem).
> Because the impact is limited to writes (joining nodes don't serve reads),
> the negative effects should hopefully be limited to uselessly queueing up a
> bunch of messages and confusing operators. So, the issue seems minor right
> now.
> In addition, we currently drop joining nodes from our notion of the ring
> very quickly (see the discussion in CASSANDRA-3895), so the time period
> during which this behavior has any impact at all should be small in modern
> Cassandra (assuming the code to avoid re-popping up dropped nodes works). My
> observations have still been on the 0.x branch. However, once CASSANDRA-3892
> is fixed, we can no longer be dropping state about joining nodes, and the
> impact window will be longer.
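
The restart scenario in the description above can be illustrated with a toy model. This is only a sketch and not Cassandra code: the class `GossipRestartSketch`, the `ToyFailureDetector`, the `Status` enum, and the addresses are all hypothetical names invented for illustration, and the real Gossiper and failure detector are considerably more involved. It assumes, as the description states, that saved state covers only fully joined nodes and that a report() for an endpoint is enough to mark it UP.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: every name here is hypothetical, not Cassandra's actual API.
public class GossipRestartSketch {
    enum Status { NORMAL, JOINING }

    // Stand-in for a failure detector: any reported endpoint is treated as UP.
    static class ToyFailureDetector {
        final Set<String> up = new HashSet<>();
        void report(String endpoint) { up.add(endpoint); }
    }

    // Assumption from the issue text: only fully joined nodes are saved across a restart.
    static Set<String> savedEndpoints(Map<String, Status> ring) {
        Set<String> saved = new HashSet<>();
        for (Map.Entry<String, Status> e : ring.entrySet()) {
            if (e.getValue() == Status.NORMAL) {
                saved.add(e.getKey());
            }
        }
        return saved;
    }

    // After a restart, endpoints missing from the saved state look "new"
    // and are reported to the failure detector, which marks them UP.
    static Set<String> upAfterRestart(Map<String, Status> ring) {
        Set<String> known = savedEndpoints(ring);
        ToyFailureDetector fd = new ToyFailureDetector();
        for (String endpoint : ring.keySet()) {
            if (!known.contains(endpoint)) {
                fd.report(endpoint);
            }
        }
        return fd.up;
    }

    public static void main(String[] args) {
        Map<String, Status> ring = new HashMap<>();
        ring.put("10.0.0.1", Status.NORMAL);
        ring.put("10.0.0.2", Status.NORMAL);
        ring.put("10.0.0.3", Status.JOINING);
        // Only the joining endpoint is treated as "new" and ends up marked UP.
        System.out.println(upAfterRestart(ring));
    }
}
```

In this model the joining endpoint is the only one that takes the "new node" path, which is the situation the description says leads to writes being queued for a node that may not actually be reachable.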