[ 
https://issues.apache.org/jira/browse/CASSANDRA-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216632#comment-13216632
 ] 

Peter Schuller commented on CASSANDRA-3960:
-------------------------------------------

(Patch will be coming.)
                
> reporting "new" nodes to FD in gossiper is incorrect for bootstrapping nodes
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3960
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3960
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>            Priority: Minor
>
> The fix committed for CASSANDRA-3626 is incorrect for bootstrapping nodes.
> The saved endpoint states include only nodes that have fully joined the ring,
> so when a node is restarted, every node still in the joining state looks
> "new" from the perspective of the restarted node. The FD report() call in the
> Gossiper then marks each such node as UP. This can be harmful because requests
> get queued up for a node that may not actually be up, which is potentially
> significant (e.g. GC pressure due to promotion into old-gen).
> In the case where I saw this in production, the node was *in fact* up, so it
> was okay (but it was later marked down because of computational complexity
> issues and the gossip stage being backed up on start-up, which is how I
> realized this could be a problem).
> Since the impact is limited to writes (joining nodes don't serve reads), the
> negative effects should hopefully be limited to uselessly queueing up a bunch
> of messages and confusing operators. So the issue seems minor right now.
> In addition, we currently drop joining nodes from our notion of the ring
> very quickly (see the discussion in CASSANDRA-3895), so the time period during
> which this behavior has any impact at all should be small in modern Cassandra
> (assuming the code to avoid re-popping up dropped nodes works). My
> observations so far have been on the 0.x branch. However, once CASSANDRA-3892
> is fixed we will no longer drop state about joining nodes, and the impact
> window will be larger.
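
For illustration only, here is a toy Java model of the behavior described
above. It is a sketch under stated assumptions, not Cassandra code and not
the forthcoming patch: ToyGossiper, ToyFailureDetector, loadSaved(),
onGossip() and the skipJoining flag are all hypothetical names. It assumes
that saved state holds only fully joined nodes and that report() is what
flips an endpoint to UP. The skipJoining guard is just one conceivable way
to avoid the problem.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only; none of these types are Cassandra's real classes. */
public class GossipSketch {
    enum Status { NORMAL, JOINING }

    /** Hypothetical failure detector: report() marks an endpoint UP. */
    static class ToyFailureDetector {
        private final Set<String> up = new HashSet<>();
        void report(String endpoint) { up.add(endpoint); }
        boolean isAlive(String endpoint) { return up.contains(endpoint); }
    }

    /** Hypothetical gossiper that reports previously unknown endpoints to the FD. */
    static class ToyGossiper {
        private final Map<String, Status> known = new HashMap<>();
        private final ToyFailureDetector fd;
        private final boolean skipJoining; // assumed guard, not the real fix

        ToyGossiper(ToyFailureDetector fd, boolean skipJoining) {
            this.fd = fd;
            this.skipJoining = skipJoining;
        }

        /** On restart, only fully joined ring members are restored. */
        void loadSaved(Set<String> ringMembers) {
            for (String endpoint : ringMembers) {
                known.put(endpoint, Status.NORMAL);
            }
        }

        /** Called when gossip carries state for an endpoint. */
        void onGossip(String endpoint, Status status) {
            boolean isNew = !known.containsKey(endpoint);
            known.put(endpoint, status);
            if (!isNew) {
                return; // already known: nothing to report in this model
            }
            if (skipJoining && status == Status.JOINING) {
                return; // guard: do not bump a joining node into UP
            }
            fd.report(endpoint); // "new" node: bumped into UP
        }
    }

    /** Restarts a toy node, gossips one joining peer, and says whether it is UP. */
    public static boolean joiningNodeMarkedUp(boolean skipJoining) {
        ToyFailureDetector fd = new ToyFailureDetector();
        ToyGossiper gossiper = new ToyGossiper(fd, skipJoining);
        gossiper.loadSaved(Collections.singleton("10.0.0.1"));
        gossiper.onGossip("10.0.0.2", Status.JOINING); // absent from saved state
        return fd.isAlive("10.0.0.2");
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + joiningNodeMarkedUp(false));
        System.out.println("guarded:   " + joiningNodeMarkedUp(true));
    }
}
```

In this model the unguarded run reports the joining node as UP (true) and
the guarded run does not (false). Whether a real fix should skip the report
entirely or handle joining nodes some other way is left open here.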

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
