[ 
https://issues.apache.org/jira/browse/CASSANDRA-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587371#comment-13587371
 ] 

Sergio Bossa commented on CASSANDRA-5062:
-----------------------------------------

So many comments, I'll try to address some:

1) Paxos complexity.

I believe Paxos *is* complex in its original form, and the ZK two-phase 
protocol is actually a simplification: election is only used to establish a 
single proposer, and then a 2PC-like atomic broadcast is used. If multiple 
nodes can propose at the same time, it means running several Paxos instances 
with several proposers.
Here's also an interesting paper about engineering problems and tradeoffs in 
implementing Paxos: 
http://www.cs.utexas.edu/~lorenzo/corsi/cs380d/papers/paper2-1.pdf

2) Paxos election vs. Paxos CAS.

Doing direct CAS rounds would be pretty nice, but even with partitioning, my 
doubts about the protocol's liveness still stand.

3) Lock-based approaches.

The word "lock" is probably misplaced here, my fault. It would just be a 
simple leader election based on placing a "lock value" on a column (kind of 
like file locks) and using that to declare a leader. Then, monotonically 
increasing numbered CAS rounds would go through the leader, and could be 
accomplished via a simplified 2PC protocol (to avoid lost acks).
If the leader fails, a new one is elected from among the nodes with the 
highest committed CAS round number (to overcome partially committed rounds).
Note: now that I write about it, this is basically Zab on top of C*, but I 
still believe it is much cheaper and easier than Paxos. I may be wrong, 
obviously :)
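
To make the idea in 3) a bit more concrete, here is a rough in-memory sketch 
in Python. It is not Cassandra code and not a proposed implementation: the 
names (Replica, elect_leader, cas) are hypothetical, the majority quorum is my 
assumption, and the "lock value" write is stubbed out with a name-based 
tie-break. It only shows numbered CAS rounds going through a leader in two 
phases, and failover picking a live node with the highest committed round. 
Recovery of rounds left half-prepared by a dead leader is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Replica:
    name: str
    alive: bool = True
    committed_round: int = 0          # highest CAS round committed on this node
    value: Optional[str] = None       # last committed value
    pending: Dict[int, str] = field(default_factory=dict)  # prepared, not yet committed

    def prepare(self, round_no: int, new_value: str) -> bool:
        # Phase 1: accept only rounds newer than anything already committed.
        if not self.alive or round_no <= self.committed_round:
            return False
        self.pending[round_no] = new_value
        return True

    def commit(self, round_no: int) -> None:
        # Phase 2: apply a round that was prepared earlier.
        if self.alive and round_no in self.pending:
            self.value = self.pending.pop(round_no)
            self.committed_round = round_no


def elect_leader(replicas: List[Replica]) -> Replica:
    # Only live nodes with the highest committed round are eligible.
    live = [r for r in replicas if r.alive]
    if not live:
        raise RuntimeError("no live replica to elect")
    best = max(r.committed_round for r in live)
    candidates = [r for r in live if r.committed_round == best]
    # Assumption: the name tie-break stands in for winning the "lock value" write.
    return min(candidates, key=lambda r: r.name)


def cas(leader: Replica, replicas: List[Replica],
        expected: Optional[str], new_value: str) -> bool:
    # The leader checks the expected value, then runs one numbered round.
    if not leader.alive or leader.value != expected:
        return False
    round_no = leader.committed_round + 1
    acks = [r for r in replicas if r.prepare(round_no, new_value)]
    if len(acks) <= len(replicas) // 2:   # assumed majority quorum
        for r in acks:
            r.pending.pop(round_no, None)  # abort the round
        return False
    for r in acks:
        r.commit(round_no)
    return True
```

With three replicas a, b and c, the first cas(leader, nodes, None, "alice") 
succeeds and a second cas(leader, nodes, None, "bob") is rejected, which is 
the account-creation case from the issue description. If c then misses a 
round and a fails, elect_leader picks b rather than the lagging c.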
                
> Support CAS
> -----------
>
>                 Key: CASSANDRA-5062
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5062
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>
> "Strong" consistency is not enough to prevent race conditions.  The classic 
> example is user account creation: we want to ensure usernames are unique, so 
> we only want to signal account creation success if nobody else has created 
> the account yet.  But naive read-then-write allows clients to race and both 
> think they have a green light to create.

