On 19.05.15 10:04, Sylvain Lebresne wrote:
On Tue, May 19, 2015 at 9:42 AM, DuyHai Doan <doanduy...@gmail.com> wrote:

    If RF > 1, the consistency level at QUORUM cannot guarantee strict
    isolation (for normal mutation or batch). If you look at this
    slide:
    http://www.slideshare.net/doanduyhai/cassandra-introduction-apache-con-2014-budapest/25,
    you can see that the mutation is sent by the coordinator, in
    parallel, to all replicas.

    Now it is very possible that, due to network latency, the mutation
    is applied on the first replica and is applied with "some delay"
    (which can be on the order of microseconds) on the other replicas.

    Theoretically, one client can read the updated value on the first
    replica and the old value on the other replicas, even at QUORUM.


Unfortunately, different people tend to have different definitions of isolation, and I don't seem to have the same definition as you, but still, I don't understand what you're talking about. Of course replicas might not get a mutation at the same time, and yes, a read at QUORUM may thus not see the most up-to-date value from all replicas.

If I understand correctly, this can only happen if a QUORUM read started *before* the QUORUM write completed. If, on the other hand, a QUORUM read follows a *completed* QUORUM write, shouldn't the read always return the most recent value?

For example, with RF = 3 and QUORUM write + read, we have nodes_written + nodes_read > RF (with nodes_written = nodes_read = 2) which guarantees consistency, or am I missing something?
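The overlap argument above can be checked by brute force. This is only a minimal sketch in plain Python, with no Cassandra involved; `quorum` and `always_overlap` are hypothetical helper names, and replicas are modelled as the integers 0..RF-1.

```python
from itertools import combinations


def quorum(rf: int) -> int:
    """Quorum size for a replication factor: floor(RF / 2) + 1."""
    return rf // 2 + 1


def always_overlap(rf: int, nodes_written: int, nodes_read: int) -> bool:
    """True if every possible write set shares a replica with every read set."""
    replicas = range(rf)
    return all(
        set(written) & set(read)
        for written in combinations(replicas, nodes_written)
        for read in combinations(replicas, nodes_read)
    )


rf = 3
q = quorum(rf)
print(q, always_overlap(rf, q, q))   # QUORUM write + QUORUM read
print(always_overlap(rf, 1, 1))      # ONE write + ONE read
```

With RF = 3, any two replicas that acknowledged the write and any two replicas that answer the read must share at least one node, which is the nodes_written + nodes_read > RF condition. The sketch only covers a read that starts after the write has completed.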

But the coordinator resolves all responses together and returns only the most recent one, so that doesn't matter to the client, and I don't see how that has anything to do with isolation from the client's perspective.

+1
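The coordinator-side resolution described above can be pictured roughly as follows. This is a simplified sketch, not Cassandra's actual code: `Response` and `resolve` are hypothetical names, and timestamps are plain integers.

```python
from typing import NamedTuple


class Response(NamedTuple):
    replica: str
    value: str
    timestamp: int  # write timestamp of the value held by this replica


def resolve(responses: list[Response]) -> Response:
    """Return the response carrying the highest write timestamp."""
    return max(responses, key=lambda r: r.timestamp)


# A QUORUM read with RF = 3 waits for two replicas; one of them is still stale.
responses = [
    Response("replica-1", "new", 200),
    Response("replica-2", "old", 100),
]
winner = resolve(responses)
print(winner.value)
```

The client only ever sees the winning value, so a lagging replica inside the read quorum is not visible to it.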


My response to the original question is that if by isolation you mean "can a reader observe a write only partially applied", then for single-partition writes, Cassandra does offer isolation.

Yes, this is exactly what I mean (and what I need for batch writes to a single partition).

One caveat, however, is that if 2 writes conflict, they are resolved using their timestamps, and if the timestamps are the same, resolution is based on values, which is not necessarily intuitive and may make it sound like the writes were not applied in isolation (even though technically they are); see https://issues.apache.org/jira/browse/CASSANDRA-6123 for details on that latter problem. I'll note that my definition of isolation does not mean you can't read stale data, and you can indeed if you use weak consistency levels.
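To illustrate the timestamp-tie caveat, here is a small sketch in plain Python. It is not Cassandra's implementation: `Cell` and `reconcile` are hypothetical names, and the tie-break rule (the bytewise greater value wins) is an assumption made for the example.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    value: bytes
    timestamp: int


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the winner of two conflicting cells for the same column."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Assumption for this sketch: on a timestamp tie, the greater value wins.
    return a if a.value >= b.value else b


# Two single-partition batches with the SAME timestamp, each writing x and y.
batch_1 = {"x": Cell(b"1", 100), "y": Cell(b"9", 100)}
batch_2 = {"x": Cell(b"2", 100), "y": Cell(b"3", 100)}

merged = {col: reconcile(batch_1[col], batch_2[col]) for col in batch_1}
print({col: cell.value for col, cell in merged.items()})
```

Under that assumption the merged row takes x from the second batch and y from the first. Each batch was applied in isolation, yet the result looks like a mix of the two, which is the non-intuitive outcome the caveat refers to.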

I completely share your view/definition of isolation - it's not about staleness, it's only about a reader not being able to observe partial writes.

Regarding staleness/consistency: if I want to read the most recent batch write, a QUORUM read must follow the completed QUORUM (batch) write, right?

Thanks for your clarifications,
Martin


If you mean something else by isolation, then I think agreeing first on the definition would be wise.

--
Sylvain

--
Martin Krasser

blog:    http://krasserm.github.io
code:    http://github.com/krasserm
twitter: http://twitter.com/mrt1nz
