[ 
https://issues.apache.org/jira/browse/CASSANDRA-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592559#comment-13592559
 ] 

Cristian Opris edited comment on CASSANDRA-5062 at 3/4/13 7:52 PM:
-------------------------------------------------------------------

[~slebresne] I have read your pseudo-code, and it seems to be pretty much what 
I was trying to describe with the version counter that counts Paxos rounds 
(except that I was thinking at row level rather than column level).

I noticed, however, that while the leader's proposal is aborted if it has a 
stale round, the acceptor algorithm does not handle the case where the 
acceptor replica is behind.

Specifically, the acceptor algorithm does not seem to handle the case where 
C_current.timestamp() < R-1

Edit: C_current.timestamp() needs to be exactly R-1 if you increment the 
counter on sending the proposal.

One way to handle that is for the acceptor to nack the proposal, indicating 
that it needs to catch up, and then either expect to receive a "snapshot" from 
the leader or do a read.
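
A minimal sketch of that acceptor-side check, to make the three cases 
concrete. This is not the pseudo-code from the ticket: the names Acceptor, 
Reply, on_propose and catch_up are hypothetical, "version" stands in for 
C_current.timestamp(), and the usual Paxos promise bookkeeping is left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Reply(Enum):
    PROMISE = "promise"          # acceptor is in sync with round R
    NACK_STALE = "nack_stale"    # proposer's round is old; proposer aborts
    NACK_BEHIND = "nack_behind"  # acceptor is behind and must catch up


@dataclass
class Acceptor:
    # Hypothetical stand-in for C_current.timestamp(): the number of the
    # last round this replica has applied.
    version: int = 0
    value: Optional[str] = None

    def on_propose(self, round_no: int) -> Reply:
        # Assumes the proposer incremented the counter before sending,
        # so an in-sync acceptor sits at exactly R-1.
        expected = round_no - 1
        if self.version > expected:
            return Reply.NACK_STALE
        if self.version < expected:
            return Reply.NACK_BEHIND
        return Reply.PROMISE

    def catch_up(self, version: int, value: Optional[str]) -> None:
        # Install a snapshot from the leader (or the result of a read).
        if version > self.version:
            self.version, self.value = version, value


acceptor = Acceptor(version=3)
first = acceptor.on_propose(6)    # behind: 3 < 5
acceptor.catch_up(5, "x")
second = acceptor.on_propose(6)   # in sync: 5 == 5
```

Here the first call nacks because the replica is behind, and the same 
proposal is promised once the snapshot has been installed.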

Also note that you don't need to send the column values with the proposal. If 
you get a quorum for the proposal, you can perform the CAS locally and just 
send the new column value with the accept.

Essentially, consensus is on the next column value to write, not on the CAS. 
Since the proposer is guaranteed to be up to date before sending the accept, 
it can do the CAS locally. 
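
A rough sketch of that flow, under simplifying assumptions: replicas are 
in-process objects, there are no failures or concurrent proposers, and the 
names Replica, propose, accept and cas are hypothetical rather than anything 
from the ticket. The proposal carries only the round number, the comparison 
runs on the proposer, and the accept carries only the new value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    version: int = 0              # number of the last applied round
    value: Optional[str] = None   # current column value

    def propose(self, round_no: int) -> bool:
        # Promise only when this replica is at exactly R-1.
        return self.version == round_no - 1

    def accept(self, round_no: int, new_value: str) -> None:
        # The accept carries just the next value, not the CAS condition.
        self.version, self.value = round_no, new_value


def cas(proposer: Replica, peers: List[Replica],
        expected: Optional[str], new_value: str) -> bool:
    round_no = proposer.version + 1
    replicas = [proposer] + peers
    promised = [r for r in replicas if r.propose(round_no)]
    if len(promised) <= len(replicas) // 2:
        return False  # no quorum; the caller may catch up and retry
    # A quorum sits at round_no - 1, the same version as the proposer,
    # so the proposer's copy is current and it can compare locally.
    if proposer.value != expected:
        return False
    for r in promised:
        r.accept(round_no, new_value)
    return True


p, a, b = Replica(), Replica(), Replica()
created = cas(p, [a, b], None, "alice")   # succeeds: column was unset
raced = cas(p, [a, b], None, "bob")       # fails: column is now "alice"
```

This mirrors the account-creation example in the issue description: the 
second caller loses the race because the comparison against the current 
value fails on the proposer, and no accept is sent.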

                
      was (Author: [email protected]):
    [~slebresne] I have read your pseudo-code, seems pretty much what I was 
trying to describe with the version counter that counts paxos rounds (except I 
was thinking at row level rather than column level)

I noticed however that while the leader's proposal is aborted if it has a stale 
round, the acceptor algorithm does not handle the case when the 
acceptor replica is behind.

Basically in the acceptor algorithm you don't seem to handle the case where 
C_current.timestamp() < R

One way to do that is to nack the proposal indicating it needs to catch up and 
either expect to receive a "snapshot" from the leader or do a read.

Also note you don't need to send the column values with the proposal. If you 
get quorum for the proposal you can perform the CAS locally and just
send the new column value with the accept

Essentially consensus is on the next column value to write, not the CAS. Since 
proposer is guaranteed to be up to date before sending accept, 
it can do the CAS locally. 

                  
> Support CAS
> -----------
>
>                 Key: CASSANDRA-5062
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5062
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>         Attachments: half-baked commit 1.jpg, half-baked commit 2.jpg, 
> half-baked commit 3.jpg
>
>
> "Strong" consistency is not enough to prevent race conditions.  The classic 
> example is user account creation: we want to ensure usernames are unique, so 
> we only want to signal account creation success if nobody else has created 
> the account yet.  But naive read-then-write allows clients to race and both 
> think they have a green light to create.
