[jira] [Comment Edited] (CASSANDRA-5062) Support CAS

Jonathan Ellis (JIRA) Wed, 27 Feb 2013 06:53:15 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588400#comment-13588400 ]


Jonathan Ellis edited comment on CASSANDRA-5062 at 2/27/13 2:51 PM:
--------------------------------------------------------------------

bq. probably the coordinator should hint something when he don't get the 
commit-ack from the 2 replicas that died

This is racy, though; if the coordinator also dies, then we still lose.

FWIW, Spinnaker's solution is actually pretty dicey here too: the leader does 
2PC, and if the leader does not get a majority of acks back to its proposal, 
it will fail the op.  But it doesn't actually abort or revert the proposal on 
the followers.  (And if it tried, it would still be open to a race where it 
fails before aborting, leaving some proposals extant.)

Then, when a new leader is elected, it replays the proposals it has not yet 
committed.  So a proposal that originally failed, and was returned as such to 
the client, could end up committed after failover.  Which is, at best, 
unexpected, and in the CAS case I'm pretty sure is outright broken.
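
To make the failure sequence concrete, here is a minimal in-memory sketch of 
it in Python.  It is a toy model written for illustration, not Spinnaker's 
actual code: the Replica class, leader_write(), failover(), and the 
five-replica layout are all assumptions.  The leader fails the op for lack of 
a majority but leaves the proposal pending on one follower, and that follower 
replays it once it becomes leader.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    # Hypothetical stand-in for a node; not a real Spinnaker or Cassandra type.
    name: str
    alive: bool = True
    pending: dict = field(default_factory=dict)    # accepted, not committed
    committed: dict = field(default_factory=dict)  # committed key -> value

    def propose(self, key, value):
        """Accept a proposal; the return value is the ack (False if down)."""
        if not self.alive:
            return False
        self.pending[key] = value
        return True

    def commit_pending(self):
        """Commit every proposal still pending on this replica."""
        self.committed.update(self.pending)
        self.pending.clear()


def leader_write(leader, followers, key, value):
    """Propose everywhere; succeed only if a majority of replicas ack.

    On failure nothing is aborted, so the proposal stays pending wherever
    it landed.
    """
    replicas = [leader] + followers
    acks = sum(r.propose(key, value) for r in replicas)
    if acks * 2 > len(replicas):
        for r in replicas:
            if r.alive:
                r.commit_pending()
        return True
    return False  # reported to the client as failed, but never reverted


def failover(new_leader, others):
    """The new leader replays its uncommitted proposals, then commits them."""
    for key, value in list(new_leader.pending.items()):
        for r in others:
            r.propose(key, value)
    for r in [new_leader] + others:
        if r.alive:
            r.commit_pending()


a, b, c, d, e = (Replica(n) for n in "abcde")
c.alive = d.alive = e.alive = False             # three followers unreachable
ok = leader_write(a, [b, c, d, e], "k", "v1")   # 2 of 5 acks: op fails
a.alive = False                                 # old leader dies
c.alive = d.alive = e.alive = True              # partition heals
failover(b, [c, d, e])                          # b is elected and replays
print(ok, b.committed)                          # prints: False {'k': 'v1'}
```

The client was told the write failed, yet "k" ends up committed on every 
surviving replica.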

I think Sergio's proposal has a similar problem: if the leader reports success 
to the client after local commit, but before it has been committed to the 
followers, we could either (1) lose the commit on failover if followers are 
pessimistic, or (2) commit data that we originally reported failed as in 
Spinnaker if we are optimistic.  On the other hand, if the leader tries to wait 
for commit ack from followers before reporting to the client, it could block 
indefinitely during a partition, so that is no solution either.
                
      was (Author: jbellis):
    bq. probably the coordinator should hint something when he don't get the 
commit-ack from the 2 replicas that died

This is racy, though; if the coordinator also dies, then we still lose.

FWIW, Spinnaker's solution is actually pretty dicey here too: the leader does 
2PC, and if the leader does not get a majority of acks back to its proposal, 
it will fail the op.  But it doesn't actually abort or revert the proposal on 
the followers.  (And if it tried, it would still be open to a race where it 
fails before aborting, leaving some proposals extant.)

Then, when a new leader is elected, it replays the proposals it has not yet 
committed.  So a proposal that originally failed, and was returned as such to 
the client, could succeed after failover.

I think Sergio's proposal has a similar problem: if the leader reports success 
to the client after local commit, but before it has been committed to the 
followers, we could either (1) lose the commit on failover if followers are 
pessimistic, or (2) commit data that we originally reported failed as in 
Spinnaker if we are optimistic.  On the other hand, if the leader tries to wait 
for commit ack from followers before reporting to the client, it could block 
indefinitely during a partition, so that is no solution either.
                  
> Support CAS
> -----------
>
>                 Key: CASSANDRA-5062
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5062
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>
> "Strong" consistency is not enough to prevent race conditions.  The classic 
> example is user account creation: we want to ensure usernames are unique, so 
> we only want to signal account creation success if nobody else has created 
> the account yet.  But naive read-then-write allows clients to race and both 
> think they have a green light to create.
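
The race described above can be shown with a small single-process sketch in 
Python.  The Store class, its put_if_absent() method, and the "newuser" key 
are hypothetical stand-ins for illustration, not Cassandra's API.  Two clients 
both read before either writes, so the naive check lets both succeed, while an 
atomic check-and-set admits only one.

```python
import threading


class Store:
    # Hypothetical in-memory key/value store used only for this illustration.
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def read(self, key):
        return self._data.get(key)

    def write(self, key, value):
        self._data[key] = value

    def put_if_absent(self, key, value):
        """Atomically create the key; True only for the call that created it."""
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True


# Naive read-then-write: both reads happen before either write.
store = Store()
alice_sees = store.read("newuser")
bob_sees = store.read("newuser")
naive_winners = []
if alice_sees is None:
    store.write("newuser", "alice")
    naive_winners.append("alice")
if bob_sees is None:
    store.write("newuser", "bob")   # silently overwrites alice's account
    naive_winners.append("bob")

# Check-and-set: the existence test and the write are one atomic step.
cas_store = Store()
cas_winners = [who for who in ("alice", "bob")
               if cas_store.put_if_absent("newuser", who)]

print(naive_winners, cas_winners)   # prints: ['alice', 'bob'] ['alice']
```

In the naive run both clients are told the account is theirs and the later 
write wins; with the atomic operation the second client gets a clear failure.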

