[ 
https://issues.apache.org/jira/browse/CASSANDRA-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964836#comment-13964836
 ] 

Robert Coli commented on CASSANDRA-6961:
----------------------------------------

While shortening the staleness race when reading at ONE is cool, I am most 
excited that this ticket provides an alternative to the previous operational 
best practice of restoring a node that was down for a long time by 
re-bootstrapping it. Briefly, re-bootstrapping reduces the number of unique 
copies of the data, whereas this approach preserves the original data and 
replica sets. In my view, we should aim to preserve the unique copy of the 
data on any given replica as much as is feasible.

> nodes should go into hibernate when join_ring is false
> ------------------------------------------------------
>
>                 Key: CASSANDRA-6961
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6961
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>             Fix For: 2.0.7, 2.1 beta2
>
>         Attachments: 6961.txt
>
>
> The impetus here is this: a node that was down for some period and comes back 
> can serve stale information.  We know from CASSANDRA-768 that we can't just 
> wait for hints, and we know that the tangentially related CASSANDRA-3569 
> prevents us from having a node that is down (from the FD's POV) handle 
> streaming.
> We can *almost* set join_ring to false, then repair, and then join the ring 
> to narrow the window (actually, you can do this and everything succeeds 
> because the node doesn't know it's a member yet, which is probably a bit of a 
> bug.)  If instead we modified this to put the node in hibernate, like 
> replace_address does, it could work almost like replace, except you could run 
> a repair (manually) while in the hibernate state, and then flip to normal 
> when it's done.
> This won't eliminate staleness entirely, but it will greatly reduce the 
> chance of it if the node has been down for a significant amount of time.
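
The sequence described in the quoted ticket (start with join_ring set to 
false, repair manually, then join) might look roughly like the sketch below. 
It only prints the commands rather than running them. CASSANDRA_HOME, its 
default path, and the rejoin_plan helper are hypothetical names used for 
illustration, and the hibernate behavior in step 1 is what the ticket 
proposes, not something this sketch verifies.

```shell
#!/bin/sh
# Sketch only: prints the operator steps instead of executing them.
# CASSANDRA_HOME and its default are hypothetical; adjust for your install.
CASSANDRA_HOME="${CASSANDRA_HOME:-/opt/cassandra}"

rejoin_plan() {
    # 1. Start the node without joining the ring. With the change proposed
    #    in this ticket, the node would sit in hibernate rather than serve reads.
    echo "$CASSANDRA_HOME/bin/cassandra -Dcassandra.join_ring=false"
    # 2. Once the node is up, run a manual repair so it catches up on the
    #    writes it missed while it was down.
    echo "$CASSANDRA_HOME/bin/nodetool repair"
    # 3. After the repair finishes, join the ring to flip back to normal.
    echo "$CASSANDRA_HOME/bin/nodetool join"
}

rejoin_plan
```

Each step should complete before the next one starts; in particular, the 
repair cannot begin until the node has finished starting up.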



