[ 
https://issues.apache.org/jira/browse/CASSANDRA-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588390#comment-13588390
 ] 

Sergio Bossa commented on CASSANDRA-5062:
-----------------------------------------

Thanks for clarifying, [~slebresne].

{quote}What happens if, when the coordinator sends the commits to replicas, 
only a minority of replicas get that commit (say 1 of 3 replicas got it (and 
persisted it), the two others die between the prepare and commit phase). And 
later on, the 2 replicas get back up while the 3rd one now dies, and we do a new 
CAS (that would have a majority and so should work).{quote}

The Zab deviation from standard 2PC here is that the coordinator doesn't need 
to wait for acks from the replicas in the commit phase.
If a replica fails during prepare phase, it will just be out of quorum.
If a replica fails after prepare but before completing the commit, it will 
recover later from the leader: so in your example, when 2 and 3 come up, they 
will join the leader, which may hint the correct values to them.
If the third replica that died in your example was actually the coordinator, a 
new coordinator will be elected among the ones that have seen either the last 
commit or the latest *proposed* commit, which will then become committed.
So there's no lost-ack problem as there's actually no ack at all in the commit 
phase: it will be "eventually" committed or recovered.
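
To make the scenario above concrete, here is a minimal in-memory sketch of the idea, not Zab itself and not anything that exists in Cassandra. All names (Replica, prepare_phase, commit_phase, elect) are hypothetical, proposals are assumed to survive a crash, and real leader election (epochs, quorum checks) is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica; 'proposed' is assumed to be durable across crashes."""
    name: str
    up: bool = True
    proposed: dict = field(default_factory=dict)   # txn id -> value seen in prepare
    committed: dict = field(default_factory=dict)  # txn id -> value actually applied

    def prepare(self, txn_id, value):
        # A replica that is down simply does not ack: it is out of the quorum.
        if not self.up:
            return False
        self.proposed[txn_id] = value
        return True

    def commit(self, txn_id):
        # Fire and forget: nothing is returned, the coordinator expects no ack.
        if self.up and txn_id in self.proposed:
            self.committed[txn_id] = self.proposed[txn_id]

    def sync_from(self, leader):
        # Recovery path: a rejoining replica copies the leader's committed state.
        self.committed.update(leader.committed)


def prepare_phase(replicas, txn_id, value):
    """Return True only if a strict majority acknowledged the proposal."""
    acks = sum(r.prepare(txn_id, value) for r in replicas)
    return acks > len(replicas) // 2


def commit_phase(replicas, txn_id):
    """Send commits without waiting for acknowledgements."""
    for r in replicas:
        r.commit(txn_id)


def elect(replicas):
    """Pick the live replica that has seen the latest proposal or commit.

    Simplification: the new leader promotes its own proposals to committed
    and the other live replicas then sync from it.
    """
    live = [r for r in replicas if r.up]
    leader = max(
        live,
        key=lambda r: max(list(r.proposed) + list(r.committed), default=0),
    )
    leader.committed.update(leader.proposed)
    for r in live:
        if r is not leader:
            r.sync_from(leader)
    return leader


# The scenario from the quote: one replica commits, two die before the commit.
r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
group = [r1, r2, r3]
prepare_phase(group, 1, "v1")
r2.up = r3.up = False
commit_phase(group, 1)          # only r1 applies the commit
r2.up = r3.up = True
r1.up = False                   # the replica holding the commit now dies
new_leader = elect(group)       # r2 and r3 still end up with txn 1 committed
```

In this sketch the commit is not lost even though only one replica applied it, because the surviving replicas had already seen the proposal during prepare.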

By the way, I'm not saying this is definitely better than Paxos: I just *think* 
it is easier and more practical (which, yes, doesn't mean it can be implemented 
easily on top of Cassandra).
                
> Support CAS
> -----------
>
>                 Key: CASSANDRA-5062
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5062
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>
> "Strong" consistency is not enough to prevent race conditions.  The classic 
> example is user account creation: we want to ensure usernames are unique, so 
> we only want to signal account creation success if nobody else has created 
> the account yet.  But naive read-then-write allows clients to race and both 
> think they have a green light to create.
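
The race described in the issue can be sketched with a toy in-memory store. This is only an illustration under stated assumptions: Store and compare_and_set are hypothetical names, a local lock stands in for whatever distributed protocol would provide atomicity, and none of this is Cassandra's API.

```python
import threading


class Store:
    """Hypothetical single-process key/value store used only for illustration."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def read(self, key):
        return self._rows.get(key)

    def write(self, key, value):
        self._rows[key] = value

    def compare_and_set(self, key, expected, new):
        # The check and the write happen as one atomic step.
        with self._lock:
            if self._rows.get(key) != expected:
                return False
            self._rows[key] = new
            return True


# Naive read-then-write: both clients read before either one writes.
naive = Store()
a_free = naive.read("alice") is None
b_free = naive.read("alice") is None
if a_free:
    naive.write("alice", "client A")
if b_free:
    naive.write("alice", "client B")   # silently overwrites client A
both_think_they_won = a_free and b_free

# Compare-and-set: only one client can move the key from "absent" to "taken".
cas = Store()
a_won = cas.compare_and_set("alice", None, "client A")
b_won = cas.compare_and_set("alice", None, "client B")
```

With the naive path both clients believe the username was free; with compare_and_set exactly one call succeeds and the other is told it lost.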
