[jira] [Updated] (CASSANDRA-8346) Paxos operation can use stale data during multiple range movements

Sylvain Lebresne (JIRA) Thu, 20 Nov 2014 02:38:58 -0800

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sylvain Lebresne updated CASSANDRA-8346:
----------------------------------------
    Attachment: 8346.txt

I don't think there is much we can do to fix this properly (reading from the 
pending endpoints could also return stale data since those aren't yet up to 
date), so I think the simplest fix is to throw an UnavailableException if we 
have 2 or more pending endpoints. Attaching a patch to do that.

> Paxos operation can use stale data during multiple range movements
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-8346
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8346
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 2.0.12
>
>         Attachments: 8346.txt
>
>
> Paxos operations correctly account for pending ranges for all operations 
> pertaining to the Paxos state, but those pending ranges are not taken into 
> account when reading the data to check for the conditions or during a serial 
> read. It's thus possible to break the LWT guarantees by reading a stale 
> value.  This requires 2 node movements (on the same token range) to be a 
> problem though.
> Basically, we have {{RF}} replicas + {{P}} pending nodes. For the Paxos 
> prepare/propose phases, the number of required participants (the "Paxos 
> QUORUM") is {{(RF + P + 1) / 2}} ({{SP.getPaxosParticipants}}), but the read 
> done to check conditions or for serial reads is done at a "normal" QUORUM (or 
> LOCAL_QUORUM), and so at a weaker {{(RF + 1) / 2}}. We have a problem if 
> said read can be served only by nodes that were not among the Paxos 
> participants, that is, if:
> {noformat}
> "normal quorum" == (RF + 1) / 2 <= (RF + P) - ((RF + P + 1) / 2) == 
> "participants considered - blocked for"
> {noformat}
> We're good if {{P = 0}} or {{P = 1}} since this inequality gives us 
> respectively {{RF + 1 <= RF - 1}} and {{RF + 1 <= RF}}, both of which are 
> impossible. But at {{P = 2}} (2 pending nodes), this inequality is equivalent 
> to {{RF <= RF}} and so we might read stale data.
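
The counting argument in the description can be checked with a small sketch. 
This is only an illustration, not code from Cassandra or from the attached 
patch: the function names are hypothetical, and it assumes a quorum is an 
integer majority, n // 2 + 1, in place of the simplified (n + 1) / 2 form used 
in the description.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a strict majority of n (assumed quorum rule)."""
    return n // 2 + 1


def stale_read_possible(rf: int, pending: int) -> bool:
    """True if a QUORUM read could be served entirely by nodes that the
    Paxos round considered but did not block for."""
    considered = rf + pending                   # RF replicas + P pending nodes
    blocked_for = majority(considered)          # the "Paxos QUORUM"
    not_blocked_for = considered - blocked_for  # "participants considered - blocked for"
    read_quorum = majority(rf)                  # "normal" QUORUM, ignores pending nodes
    return read_quorum <= not_blocked_for


# Hypothetical walk through the three cases discussed in the description.
for p in range(3):
    print(f"RF=3, P={p}: stale read possible = {stale_read_possible(3, p)}")
```

Under these assumptions, RF = 3 shows no problem at P = 0 or P = 1 and a 
possible stale read at P = 2, which matches the description.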



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
