[ https://issues.apache.org/jira/browse/CASSANDRA-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Hanna updated CASSANDRA-14739:
-------------------------------------
    Description: 
If two nodes that own adjacent portions of the ring are bootstrapped at the 
same time (i.e. always for vnodes), they will not receive the correct data for 
pending writes (and perhaps not for streaming - TBC).

By default, we don't "permit" multiple nodes to bootstrap at once, but:
  
 # The logic we use to prevent this isn’t itself strongly consistent (or 
atomically applied).  If two nodes start bootstrapping close together in time, 
or simply get divergent gossip state, they can both believe there is no other 
node bootstrapping and proceed.
 # The bug doesn’t require two nodes to _actually_ bootstrap at the same time. 
There only needs to be divergent gossip state on a coordinator, so that the 
coordinator _believes_ there are multiple nodes bootstrapping, even though one 
of them may have completed and they never overlapped in reality.
 # We can bootstrap and remove nodes concurrently, I think?  I’m pretty sure 
this can also be unsafe, but it needs some more thought.
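
To make the first two points concrete, here is a minimal toy model in Python. 
It is only a sketch and is not Cassandra code: GossipView, may_bootstrap, 
try_bootstrap and gossip are hypothetical names invented for illustration, and 
gossip is reduced to copying a set between views. It assumes the guard is a 
check-then-act on each node's local view, which is how the description above 
characterises it.

```python
from dataclasses import dataclass, field


@dataclass
class GossipView:
    """One participant's local, possibly stale, belief about who is bootstrapping."""
    bootstrapping: set = field(default_factory=set)


def may_bootstrap(view: GossipView) -> bool:
    # Hypothetical guard: proceed only if no bootstrap is visible locally.
    return not view.bootstrapping


def try_bootstrap(name: str, own_view: GossipView) -> bool:
    # Check-then-act on local state only; nothing makes this atomic cluster-wide.
    if not may_bootstrap(own_view):
        return False
    own_view.bootstrapping.add(name)
    return True


def gossip(src: GossipView, dst: GossipView) -> None:
    # Toy propagation: dst learns what src believes, but never unlearns anything.
    dst.bootstrapping |= src.bootstrapping


def racing_start():
    # Point 1: A and B both check before either announcement has propagated.
    view_a, view_b = GossipView(), GossipView()
    return try_bootstrap("A", view_a), try_bootstrap("B", view_b)


def stale_coordinator():
    # Point 2: A finishes before B starts, yet the coordinator sees both.
    view_a, view_b, coord = GossipView(), GossipView(), GossipView()
    try_bootstrap("A", view_a)
    gossip(view_a, coord)              # coordinator learns A is bootstrapping
    view_a.bootstrapping.discard("A")  # A completes; coordinator is not told yet
    try_bootstrap("B", view_b)         # B starts only after A has finished
    gossip(view_b, coord)              # coordinator learns B is bootstrapping
    return sorted(coord.bootstrapping)


print(racing_start())       # (True, True)
print(stale_coordinator())  # ['A', 'B']
```

In the second scenario the two bootstraps never overlap, but the coordinator's 
view still contains both, which is the divergent state the second point 
describes.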


> calculatePendingRanges when multiple concurrent range movements is unsafe
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14739
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14739
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Benedict
>            Priority: Major
>              Labels: correctness


