sankalp kohli created CASSANDRA-12126:
-----------------------------------------

             Summary: CAS Reads Inconsistencies 
                 Key: CASSANDRA-12126
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
             Project: Cassandra
          Issue Type: Bug
            Reporter: sankalp kohli


While looking at the CAS code in Cassandra, I found a potential issue with CAS 
reads. Here is how it can happen with RF=3:

1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
true to the propose and saves the commit in its accepted field. The other two 
machines, B and C, do not get to the accept phase. 

The current state is that machine A has this commit in the paxos table as 
accepted but not committed, and B and C do not. 

2) Issue a CAS Read and it goes to only B and C. You won't be able to read the 
value written in step 1. This step behaves as if nothing is inflight. 

3) Issue another CAS Read and it goes to A and B. Now we will discover that 
there is something inflight from A and will propose and commit it with the 
current ballot. Now we can read the value written in step 1 as part of this CAS 
read.

If we skip step 3 and instead run step 4, we will never learn about the value 
written in step 1. 

4) Issue a CAS Write and it involves only B and C. This will succeed and commit 
a different value than step 1. The step 1 value will never be seen again and 
was never seen before. 
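
The steps above can be replayed with a toy model. This is only a sketch under 
stated assumptions: Replica, prepare, commit, serial_read and fresh_cluster are 
hypothetical names, ballots are plain integers, and the propose and commit 
phases are collapsed into one call. It is not the Cassandra implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Proposal = Tuple[int, str]  # (ballot, value)

@dataclass
class Replica:
    name: str
    promised: int = 0
    accepted: Optional[Proposal] = None   # accepted but not yet committed
    committed: Optional[Proposal] = None  # most recent commit

def prepare(quorum: List[Replica], ballot: int) -> Optional[Proposal]:
    """Promise `ballot` on the quorum; return the newest in-flight proposal seen."""
    newest_commit = max((r.committed[0] for r in quorum if r.committed), default=0)
    in_flight = None
    for r in quorum:
        r.promised = max(r.promised, ballot)
        acc = r.accepted
        # Ignore accepted proposals that a newer commit has already superseded.
        if acc and acc[0] > newest_commit and (in_flight is None or acc[0] > in_flight[0]):
            in_flight = acc
    return in_flight

def commit(quorum: List[Replica], ballot: int, value: str) -> None:
    """Toy shortcut: propose and commit `value` on every replica in the quorum."""
    for r in quorum:
        r.accepted = None
        r.committed = (ballot, value)

def serial_read(quorum: List[Replica], ballot: int) -> Optional[str]:
    in_flight = prepare(quorum, ballot)
    if in_flight is not None:
        # Step 3 behaviour: finish the in-flight write under the current ballot.
        commit(quorum, ballot, in_flight[1])
    commits = [r.committed for r in quorum if r.committed]
    return max(commits)[1] if commits else None

def fresh_cluster():
    a, b, c = Replica("A"), Replica("B"), Replica("C")
    # Step 1: the propose reaches only A, which stores it as accepted.
    a.promised, a.accepted = 1, (1, "v1")
    return a, b, c

# Steps 1, 2, 3: the second read returns a value the first read missed.
a, b, c = fresh_cluster()
read_2 = serial_read([b, c], ballot=2)   # None
read_3 = serial_read([a, b], ballot=3)   # "v1"

# Steps 1, 2, 4: a write through B and C wins and "v1" is never observed.
a, b, c = fresh_cluster()
serial_read([b, c], ballot=2)
prepare([b, c], ballot=3)
commit([b, c], ballot=3, value="v2")
read_after_4 = serial_read([a, b], ballot=4)   # "v2"
```

In this model, whether the step 1 value ever becomes visible depends only on 
which replicas the later operations happen to contact.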



Section 2.3 of Lamport's “Paxos Made Simple” paper talks about this issue, 
which is how learners can find out whether a majority of the acceptors have 
accepted the proposal. 

In step 3, it is correct that we propose the value again, since we don't know 
whether it was accepted by a majority of acceptors. When we ask a majority of 
acceptors and at least one of them, but not a majority, has something in 
flight, we have no way of knowing whether it was accepted by a majority of 
acceptors. So this behavior is correct. 
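
The counting argument can be sketched as follows. The function name 
chosen_possibilities and its parameters are hypothetical; it simply enumerates 
what the replicas we did not hear from might have done.

```python
def chosen_possibilities(rf: int, responders: int, accepted_seen: int) -> set:
    """Possible answers to 'was the proposal accepted by a majority?'

    rf            -- replication factor (number of acceptors)
    responders    -- how many acceptors answered this round
    accepted_seen -- how many responders reported the proposal as accepted
    """
    majority = rf // 2 + 1
    unseen = rf - responders
    outcomes = set()
    # Each acceptor we did not hear from may or may not have accepted.
    for hidden_accepts in range(unseen + 1):
        outcomes.add(accepted_seen + hidden_accepts >= majority)
    return outcomes

# RF=3, two responders, one of them has the proposal in flight: ambiguous.
ambiguous = chosen_possibilities(rf=3, responders=2, accepted_seen=1)

# RF=3, two responders, neither has anything in flight: definitely not chosen.
not_chosen = chosen_possibilities(rf=3, responders=2, accepted_seen=0)
```

Here ambiguous is {True, False}, which is why re-proposing is the safe choice, 
while not_chosen is {False}.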

However, we need to fix step 2, since it causes reads to not be linearizable 
with respect to writes and other reads. In this case, we know that a majority 
of acceptors have no inflight commit, which means we know that nothing was 
accepted by a majority. I think we should run a propose step here with an empty 
commit, and that will cause the write from step 1 to never become visible 
afterwards. 

With this fix, we will either see the data written in step 1 on the next serial 
read or never see it, which is what we want. 
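
A rough sketch of the proposed behavior, again as a toy model rather than 
Cassandra code. All names (Replica, serial_read, the propose_empty flag, the 
EMPTY marker) are assumptions made for illustration, and the model assumes that 
a replica remembers the ballot of its most recent commit so that older accepted 
proposals can be ignored.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

EMPTY = None  # hypothetical marker for an empty commit (no mutation)

@dataclass
class Replica:
    name: str
    promised: int = 0
    accepted: Optional[Tuple[int, Optional[str]]] = None  # (ballot, value)
    commit_ballot: int = 0       # ballot of the most recent commit, empty or not
    data: Optional[str] = None   # value visible to reads

def prepare(quorum: List[Replica], ballot: int):
    """Promise `ballot`; return the newest proposal not superseded by a commit."""
    newest_commit = max(r.commit_ballot for r in quorum)
    in_flight = None
    for r in quorum:
        r.promised = max(r.promised, ballot)
        acc = r.accepted
        if acc and acc[0] > newest_commit and (in_flight is None or acc[0] > in_flight[0]):
            in_flight = acc
    return in_flight

def propose_and_commit(quorum: List[Replica], ballot: int, value: Optional[str]) -> None:
    for r in quorum:
        if ballot < r.promised:
            continue  # a newer ballot was promised; a real proposer would retry
        r.accepted = None
        r.commit_ballot = ballot
        if value is not EMPTY:
            r.data = value

def serial_read(quorum: List[Replica], ballot: int, propose_empty: bool) -> Optional[str]:
    in_flight = prepare(quorum, ballot)
    if in_flight is not None:
        propose_and_commit(quorum, ballot, in_flight[1])  # finish the in-flight write
    elif propose_empty:
        propose_and_commit(quorum, ballot, EMPTY)         # the proposed fix for step 2
    return next((r.data for r in quorum if r.data is not None), None)

def run(propose_empty: bool):
    a, b, c = Replica("A"), Replica("B"), Replica("C")
    a.promised, a.accepted = 1, (1, "v1")             # step 1: only A accepted
    first = serial_read([b, c], 2, propose_empty)     # step 2
    second = serial_read([a, b], 3, propose_empty)    # step 3
    return first, second

without_fix = run(propose_empty=False)  # (None, "v1")
with_fix = run(propose_empty=True)      # (None, None)
```

With the empty propose in step 2, the read in step 3 sees a commit newer than 
A's accepted proposal and ignores it, so both reads agree.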




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
