[
https://issues.apache.org/jira/browse/CASSANDRA-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023151#comment-13023151
]
Peter Schuller commented on CASSANDRA-2494:
-------------------------------------------
I don't think anyone is claiming otherwise, unless I'm misunderstanding. The
problem is that while the "if successfully written to quorum, subsequent quorum
reads will see it" guarantee is indeed maintained, it is possible for quorum
reads to see data go backwards (on a timeline) in the event of a *failed*
attempted quorum write. This includes the possibility of reads seeing data that
then permanently vanishes, even though you only lost, say, 1 node, which you
designed your cluster to survive (RF >= 3, QUORUM). ("lost 1 node" can be
substituted with "killed 1 node in periodic commit mode".)
I still don't think this is a violation of what was promised, but I can see how
making the further guarantee would make for more useful consistency semantics
in some cases.
With respect to implicit write: An alternative is to adjust reconciliation
logic when applied as part of reads (as opposed to AES, hinted hand-off,
writes) to take consistency level into account and only consider columns whose
timestamp is >= the greatest timestamp that has quorum (off the top of my head
I think that should be correct in all cases, but I haven't thought this through
thoroughly).
> Quorum reads are not consistent
> -------------------------------
>
> Key: CASSANDRA-2494
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2494
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sean Bridges
>
> As discussed in this thread,
> http://www.mail-archive.com/[email protected]/msg12421.html
> Quorum reads should be consistent. Assume we have a cluster of 3 nodes
> (X,Y,Z) and a replication factor of 3. If a write of N is committed to X, but
> not Y and Z, then a read from X should not return N unless the read is
> committed to at least two nodes. To ensure this, a read from X should wait
> for an ack of the read repair write from either Y or Z before returning.
> Are there system tests for cassandra? If so, there should be a test similar
> to the original post in the email thread. One thread should write 1,2,3...
> at consistency level ONE. Another thread should read at consistency level
> QUORUM from a random host, and verify that each read is >= the last read.
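
The scenario in the issue description can be illustrated with a toy in-memory
model. This is only a sketch under stated assumptions: it does not use any
Cassandra API, and every name in it (write_to, quorum_read, run_scenario,
is_monotonic) is hypothetical. It models three replicas X, Y, Z, a write that
reaches only X, the loss of X, and a reader that checks that each QUORUM read
is >= the previous one, with and without waiting for the read repair write.

```python
# Toy model: each replica stores one cell as a (timestamp, value) tuple.
# All names here are hypothetical; this is not Cassandra code.

def write_to(replicas, name, timestamp, value):
    """Apply a write to one replica; the newest timestamp wins."""
    if timestamp > replicas[name][0]:
        replicas[name] = (timestamp, value)

def quorum_read(replicas, contacted, blocking_repair):
    """Read from the contacted replicas and return the newest value.

    With blocking_repair=True, the newest cell is written back to every
    contacted replica before the read returns (the behaviour the issue
    asks for). With False, no repair is modelled at all.
    """
    newest = max(replicas[name] for name in contacted)
    if blocking_repair:
        for name in contacted:
            write_to(replicas, name, *newest)
    return newest[1]

def is_monotonic(reads):
    """True if each read is >= the read before it."""
    return all(a <= b for a, b in zip(reads, reads[1:]))

def run_scenario(blocking_repair):
    replicas = {"X": (0, 0), "Y": (0, 0), "Z": (0, 0)}
    # Value 1 is fully replicated.
    for name in replicas:
        write_to(replicas, name, 1, 1)
    # Value 2 is committed to X only (the write to Y and Z never lands).
    write_to(replicas, "X", 2, 2)
    reads = [quorum_read(replicas, ["X", "Y"], blocking_repair)]
    # X is lost; later quorum reads can only contact Y and Z.
    reads.append(quorum_read(replicas, ["Y", "Z"], blocking_repair))
    return reads

if __name__ == "__main__":
    for blocking in (False, True):
        reads = run_scenario(blocking)
        print(blocking, reads, is_monotonic(reads))
```

In this model, the run without repair shows the second read going backwards,
while the run that waits for the repair write keeps the reads monotonic. A
real system test would need concurrent writer and reader threads against an
actual cluster, which this sketch does not attempt.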
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira