[ https://issues.apache.org/jira/browse/CASSANDRA-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremy Hanna updated CASSANDRA-6023:
------------------------------------
    Labels: LWT  (was: )

> CAS should distinguish promised and accepted ballots
> ----------------------------------------------------
>
>                 Key: CASSANDRA-6023
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6023
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Major
>              Labels: LWT
>             Fix For: 2.0.1
>
>         Attachments:
> 0001-Distinguish-between-promised-and-accepted-ballots.txt,
> 0002-Populate-commitsByReplica-in-PrepareCallback.txt
>
>
> Currently, we only keep 1) the most recent promise we've made and 2) the last
> update we've accepted. But we don't keep the ballot at which that last update
> was accepted. And because a node always promises to a newer ballot, this means
> an already committed update can be replayed even after another update has
> been committed. Re-committing a value is fine, but only as long as we have
> not started a new round yet.
> Concretely, we can have the following case (with 3 nodes A, B and C) with the
> current implementation:
> * A proposer P1 prepares and proposes a value X at ballot t1. It is accepted
> by all nodes.
> * A proposer P2 proposes at t2 (wanting to commit a new value Y). If, say, A
> and B receive the commit of P1 before the propose of P2 but C receives those
> in the reverse order, we'll currently have the following states:
> {noformat}
> A: in-progress = (t2, _), mrc = (t1, X)
> B: in-progress = (t2, _), mrc = (t1, X)
> C: in-progress = (t2, X), mrc = (t1, X)
> {noformat}
> Because C has received the t1 commit after promising t2, it won't have
> removed X during the t1 commit (but note that the problem is not during
> commit; the example still stands if C never receives any commit message).
> * Now, based on the promises of A and B, P2 will propose Y at t2 (C does not
> see this propose, at least not before it promises on t3 below). A and B
> accept, and P2 sends a commit for Y.
> * In the meantime, a proposer P3 submits a prepare at t3 (for some other
> irrelevant value), which reaches C before it receives P2's propose and
> commit. That prepare reaches A and B too, but after the P2 commit. At that
> point the state will be:
> {noformat}
> A: in-progress = (t3, _), mrc = (t2, Y)
> B: in-progress = (t3, _), mrc = (t2, Y)
> C: in-progress = (t3, X), mrc = (t2, Y)
> {noformat}
> In particular, C still has X as its update because each time it got a
> commit, it had already promised a more recent ballot and thus skipped the
> delete. The value is still X because C received the P2 propose after having
> promised t3 and thus refused it.
> * P3 gets back the promises of, say, C and A. Both responses have t3 as the
> in-progress ballot (and it is more recent than any mrc), but C's response
> comes with value X. So P3 will replay X. Assuming no more contention, this
> replay will succeed and X will be committed at t3.
> At the end of that example, we've committed X, Y and then X again, even
> though only P1 has ever proposed X.
> I believe the correct fix is to keep the ballot at which an update was
> accepted (instead of using the most recent promised ballot). That way, in the
> example above, P3 would receive from C a promise on t3, but would know that X
> was accepted at t1. And so P3 would be able to ignore X, since the mrc of A
> will tell it that X is an obsolete value.
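
The replay decision described in the quoted issue can be illustrated with a small sketch. This is not Cassandra's code: the `PromiseResponse` class, its field names, and both functions are hypothetical, and ballots t1, t2, t3 are modelled as the integers 1, 2, 3. It only contrasts tagging the accepted value with the promised ballot (the behaviour the issue describes) against tagging it with the ballot at which it was accepted (the proposed fix).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PromiseResponse:
    """Hypothetical shape of a replica's reply to a prepare."""
    promised: int                   # most recent ballot the replica promised
    accepted_ballot: Optional[int]  # ballot at which accepted_value was accepted
    accepted_value: Optional[str]   # last accepted update not yet cleared
    mrc_ballot: int                 # ballot of the most recent commit (mrc)
    mrc_value: Optional[str]        # value of the most recent commit


def value_to_replay_old(responses: List[PromiseResponse]) -> Optional[str]:
    """Behaviour described in the issue: the in-progress value is paired
    with the promised ballot, so it always looks newer than any commit."""
    latest_commit = max(r.mrc_ballot for r in responses)
    candidates = [(r.promised, r.accepted_value)
                  for r in responses if r.accepted_value is not None]
    if not candidates:
        return None
    ballot, value = max(candidates)
    return value if ballot > latest_commit else None


def value_to_replay_fixed(responses: List[PromiseResponse]) -> Optional[str]:
    """Proposed fix: compare the ballot at which the value was accepted
    against the most recent commit, and ignore obsolete values."""
    latest_commit = max(r.mrc_ballot for r in responses)
    candidates = [(r.accepted_ballot, r.accepted_value)
                  for r in responses if r.accepted_value is not None]
    if not candidates:
        return None
    ballot, value = max(candidates)
    return value if ballot > latest_commit else None


# Responses P3 gets from A and C in the example (t1=1, t2=2, t3=3).
a = PromiseResponse(promised=3, accepted_ballot=None, accepted_value=None,
                    mrc_ballot=2, mrc_value="Y")
c = PromiseResponse(promised=3, accepted_ballot=1, accepted_value="X",
                    mrc_ballot=2, mrc_value="Y")

print(value_to_replay_old([a, c]))    # X is replayed
print(value_to_replay_fixed([a, c]))  # None: X was accepted at t1, before the t2 commit
```

With the accepted ballot kept separately, P3 sees that X (accepted at t1) is older than the commit at t2 reported by A, and so proceeds with its own value instead of replaying X.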