Benedict updated CASSANDRA-14740:
    Status: Patch Available  (was: Open)

The basic approach is quite simple: when we repair, we build a {{WritePlan}}, 
but we only select those nodes we need to meet the consistency level of the 
operation we are performing, and we only consider live nodes.  We prefer those 
nodes we have read from.  If they are all present, and they are sufficient to 
meet consistency, we behave as before.  In any other scenario, we build a 
partition representing all differences we have seen, and propagate this to any 
node that wasn't one of the original targets.
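
A minimal sketch of that selection, for illustration only.  The class, method 
and parameter names here are hypothetical and are not the patch's actual 
{{WritePlan}} API.  It assumes {{blockFor}} already accounts for any pending 
nodes, and it keeps every live node we read from rather than trimming to the 
minimum:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these names are not Cassandra's WritePlan API.
final class RepairPlanSketch {

    /** Which nodes get only their own diff, and which get the full merged diff. */
    static final class Plan {
        final List<String> perNodeDiff = new ArrayList<>();
        final List<String> fullDiff = new ArrayList<>();
    }

    /**
     * @param readFrom     nodes the read contacted, in preference order
     * @param liveReplicas live natural and pending replicas according to the ring
     * @param blockFor     acknowledgements needed (assumed to include pending nodes)
     */
    static Plan plan(List<String> readFrom, List<String> liveReplicas, int blockFor) {
        Plan plan = new Plan();

        // Prefer the nodes we read from, but only if they are still live replicas.
        for (String node : readFrom) {
            if (liveReplicas.contains(node)) {
                plan.perNodeDiff.add(node);
            }
        }

        // Top up with other live replicas until the consistency level can be met.
        // These nodes were not read from, so they receive the full merged diff.
        for (String node : liveReplicas) {
            if (plan.perNodeDiff.size() + plan.fullDiff.size() >= blockFor) {
                break;
            }
            if (!readFrom.contains(node)) {
                plan.fullDiff.add(node);
            }
        }

        // Assumption: a real coordinator would report unavailability here.
        if (plan.perNodeDiff.size() + plan.fullDiff.size() < blockFor) {
            throw new IllegalStateException("not enough live replicas to meet consistency");
        }
        return plan;
    }
}
```

When every node we read from is live and there are enough of them, 
{{fullDiff}} stays empty, which corresponds to the "behave as before" case.  
If a read node has left the ring, or a pending node raises {{blockFor}}, the 
extra target lands in {{fullDiff}}.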

There are some minor inefficiencies.  For example, we do not special-case the 
scenario where the ownership of only one node has changed and there is no 
pending node, in which case we _might_ be able to propagate only the 
difference found on reconciling the presumably replaced node.  (If an unsafe 
bootstrap occurred even this would not be correct, though that might be an 
acceptable consistency failure given the semantic guarantees of unsafe 
bootstrap.)

We also don't bother to avoid merging the complete diff row with any other 
pending repairs if we have to perform an additional write.  It is assumed that 
these scenarios are rare, and not worth the significant extra complexity.

Unfortunately fixing unit tests was painful and not super beautiful.  This is 
because read-repair now consults the ring to decide who a repair should be 
routed to, instead of assuming it is sufficient to write to those we read from.  
The tests as written assume the ring can be empty, and also that replication 
factor isn't relevant, so to avoid completely rewriting the tests, I have done 
some ugly things.

> BlockingReadRepair does not maintain monotonicity during range movements
> ------------------------------------------------------------------------
>                 Key: CASSANDRA-14740
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14740
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Coordination
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Urgent
>              Labels: correctness
>             Fix For: 4.0
> The BlockingReadRepair code introduced by CASSANDRA-10726 requires that each 
> of the queried nodes are written to, but pending nodes are not considered.  
> If there is a pending range movement, one of these writes may be ‘lost’ when 
> the range movement completes.

