[ https://issues.apache.org/jira/browse/CASSANDRA-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremy Hanna updated CASSANDRA-6023:
------------------------------------
Labels: LWT (was: )
> CAS should distinguish promised and accepted ballots
> ----------------------------------------------------
>
> Key: CASSANDRA-6023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6023
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Priority: Major
> Labels: LWT
> Fix For: 2.0.1
>
> Attachments:
> 0001-Distinguish-between-promised-and-accepted-ballots.txt,
> 0002-Populate-commitsByReplica-in-PrepareCallback.txt
>
>
> Currently, we only keep 1) the most recent promise we've made and 2) the last
> update we've accepted. But we don't keep the ballot at which that last update
> was accepted. And because a node always promises to a newer ballot, this means
> an already committed update can be replayed even after another update has
> been committed. Re-committing a value is fine, but only as long as we have not
> started a new round yet.
> Concretely, we can have the following case (with 3 nodes A, B and C) with the
> current implementation:
> * A proposer P1 prepares and proposes a value X at ballot t1. It is accepted by
> all nodes.
> * A proposer P2 prepares at t2 (wanting to commit a new value Y). If, say, A and
> B receive the commit of P1 before the prepare of P2 but C receives those in
> the reverse order, we'll currently have the following states:
> {noformat}
> A: in-progress = (t2, _), mrc = (t1, X)
> B: in-progress = (t2, _), mrc = (t1, X)
> C: in-progress = (t2, X), mrc = (t1, X)
> {noformat}
> Because C has received the t1 commit after promising t2, it won't have
> removed X during the t1 commit (but note that the problem is not during commit;
> the example still stands if C never receives any commit message).
> * Now, based on the promises of A and B, P2 will propose Y at t2 (C does not
> see this proposal, at least not before it promises t3 below). A and B accept,
> and P2 sends a commit for Y.
> * In the meantime, a proposer P3 submits a prepare at t3 (for some other
> irrelevant value) which reaches C before it receives P2's propose and commit.
> That prepare reaches A and B too, but after the P2 commit. At that point the
> state will be:
> {noformat}
> A: in-progress = (t3, _), mrc = (t2, Y)
> B: in-progress = (t3, _), mrc = (t2, Y)
> C: in-progress = (t3, X), mrc = (t2, Y)
> {noformat}
> In particular, C still has X as its update because each time it got a commit,
> it had already promised a more recent ballot and thus skipped the delete. The
> value is still X because it received the P2 propose after having promised t3
> and thus refused it.
> * P3 gets back the promises of, say, C and A. Both responses have t3 as
> in-progress ballot (and it is more recent than any mrc) but C's comes with
> value X. So P3 will replay X. Assuming no more contention, this replay will
> succeed and X will be committed at t3.
> At the end of that example, we've committed X, Y and then X again, even though
> only P1 has ever proposed X.
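
The scenario above can be replayed with a small toy model. This is only a
sketch, not the Cassandra code: the Replica class, its method names and
value_to_replay are hypothetical, ballots t1..t3 are modelled as the integers
1..3, and the replica keeps a single ballot for both "promised" and "accepted",
as the description says the current implementation does.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Replica:
    """Hypothetical model of the current (pre-fix) per-replica Paxos state."""
    promised: int = 0                              # most recent promised ballot
    update: Optional[str] = None                   # last accepted update, if any
    mrc: Tuple[int, Optional[str]] = (0, None)     # most recent commit

    def prepare(self, ballot: int) -> bool:
        # Promise any newer ballot; the accepted update is left untouched.
        if ballot > self.promised:
            self.promised = ballot
            return True
        return False

    def propose(self, ballot: int, value: str) -> bool:
        # Accept unless a newer ballot has already been promised.
        if ballot >= self.promised:
            self.promised = ballot
            self.update = value
            return True
        return False

    def commit(self, ballot: int, value: str) -> None:
        if ballot > self.mrc[0]:
            self.mrc = (ballot, value)
        # The accepted update is only deleted if no newer promise was made.
        if ballot == self.promised:
            self.update = None


def value_to_replay(replicas):
    """Proposer side: replay the in-progress update if it looks newer than any mrc."""
    newest_mrc = max(r.mrc[0] for r in replicas)
    pending = [(r.promised, r.update) for r in replicas if r.update is not None]
    if not pending:
        return None
    ballot, update = max(pending, key=lambda p: p[0])
    return update if ballot > newest_mrc else None


def run_scenario():
    t1, t2, t3 = 1, 2, 3
    a, b, c = Replica(), Replica(), Replica()
    for r in (a, b, c):            # P1 prepares and proposes X at t1
        r.prepare(t1)
        r.propose(t1, "X")
    for r in (a, b):               # A and B: commit of P1, then prepare of P2
        r.commit(t1, "X")
        r.prepare(t2)
    c.prepare(t2)                  # C sees them in the reverse order
    c.commit(t1, "X")
    for r in (a, b):               # P2 proposes and commits Y on A and B
        r.propose(t2, "Y")
        r.commit(t2, "Y")
    c.prepare(t3)                  # P3's prepare reaches C first
    c.propose(t2, "Y")             # refused: C already promised t3
    c.commit(t2, "Y")
    for r in (a, b):
        r.prepare(t3)
    return a, b, c


a, b, c = run_scenario()
replayed = value_to_replay([c, a])   # P3 hears back from C and A
print(replayed)                      # X
```

In this model C ends up with in-progress (3, "X") and mrc (2, "Y"), so the
proposer cannot tell that X was really accepted at t1 and replays it.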
> I believe the correct fix is to keep the ballot at which an update was accepted
> (instead of using the most recent promised ballot). That way, in the example
> above, P3 would receive from C a promise on t3, but would know that X was
> accepted at t1. And so P3 would be able to ignore X since the mrc of A will
> tell it that X is an obsolete value.
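
As a rough illustration of the proposed fix, the proposer-side check might
look like the sketch below. It is not taken from the attached patches: the
Promise type, its fields and update_to_replay are hypothetical names, and
ballots are again plain integers. The only point is that the accepted update
now carries the ballot at which it was accepted, and that ballot (not the
promised one) is compared against the most recent commit.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Promise:
    """Hypothetical prepare response once accepted ballots are tracked."""
    promised: int                          # ballot the replica just promised
    accepted: Optional[Tuple[int, str]]    # (ballot, update) last accepted, if any
    mrc_ballot: int                        # ballot of the most recent commit


def update_to_replay(promises: Sequence[Promise]) -> Optional[str]:
    """Return an accepted update only if it is newer than every known commit."""
    newest_mrc = max(p.mrc_ballot for p in promises)
    pending = [p.accepted for p in promises
               if p.accepted is not None and p.accepted[0] > newest_mrc]
    if not pending:
        return None                        # nothing unfinished: propose our own value
    return max(pending, key=lambda acc: acc[0])[1]


# State from the example, with t1..t3 as 1..3: C promised t3 but accepted X at t1.
from_c = Promise(promised=3, accepted=(1, "X"), mrc_ballot=2)
from_a = Promise(promised=3, accepted=None, mrc_ballot=2)
stale = update_to_replay([from_c, from_a])

# A genuinely unfinished round (accepted after the last commit) is still replayed.
unfinished = update_to_replay([
    Promise(promised=5, accepted=(4, "Z"), mrc_ballot=2),
    Promise(promised=5, accepted=None, mrc_ballot=2),
])
```

Here stale is None because X was accepted at t1, which is older than the t2
commit reported in the same responses, while unfinished is "Z".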