On 19.05.15 10:04, Sylvain Lebresne wrote:
On Tue, May 19, 2015 at 9:42 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
If RF > 1, the consistency level at QUORUM cannot guarantee strict
isolation (for normal mutation or batch). If you look at this
slide:
http://www.slideshare.net/doanduyhai/cassandra-introduction-apache-con-2014-budapest/25,
you can see that the mutation is sent by the coordinator, in
parallel, to all replicas.
Now it is quite possible that, due to network latency, the mutation
is applied on the first replica and is applied with "some delay"
(which can be on the order of microseconds) on the other replicas.
Theoretically, one client can read the updated value on the first
replica and the old value on the other replicas, even at QUORUM.
Unfortunately, different people tend to have different definitions
of isolation, and I don't seem to have the same definition as you, but
still, I don't understand what you're talking about. Of course
replicas might not get a mutation at the same time, and yes, a read at
QUORUM may thus not see the most up to date value from all replicas.
If I understand correctly, this can only happen if a QUORUM read started
*before* the QUORUM write completed. If, on the other hand, a QUORUM
read follows a *completed* QUORUM write, shouldn't the read always
return the most recent value?
For example, with RF = 3 and QUORUM write + read, we have nodes_written
+ nodes_read > RF (with nodes_written = nodes_read = 2) which guarantees
consistency, or am I missing something?
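
To make the arithmetic above concrete, here is a minimal Python sketch that brute-forces the overlap argument. It is only a model of the counting, not of Cassandra itself: the helper names `quorum` and `always_overlap` are made up for this illustration, and it assumes the write has already been acknowledged by its replicas before the read starts.

```python
from itertools import combinations


def quorum(rf: int) -> int:
    """Size of a QUORUM for replication factor rf (a majority of replicas)."""
    return rf // 2 + 1


def always_overlap(rf: int, nodes_written: int, nodes_read: int) -> bool:
    """True if every possible write set shares a replica with every read set.

    Hypothetical helper: it enumerates all replica subsets of the given
    sizes and checks that each pair has at least one replica in common.
    """
    replicas = range(rf)
    return all(
        set(written) & set(read)
        for written in combinations(replicas, nodes_written)
        for read in combinations(replicas, nodes_read)
    )


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # QUORUM write + QUORUM read: nodes_written + nodes_read > RF
    print(q, always_overlap(rf, q, q))
    # ONE write + ONE read: nodes_written + nodes_read <= RF
    print(always_overlap(rf, 1, 1))
```

With RF = 3 every pair of 2-replica sets shares at least one replica, so a QUORUM read that starts after a completed QUORUM write contacts at least one replica holding that write.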
But the coordinator resolves all responses together and returns only
the most recent one, so that doesn't matter to the client and I don't
see how that has anything to do with isolation from the client
perspective.
+1
My response to the original question is that if by isolation you mean
"can a reader observe a write only partially applied", then for single
partition writes, Cassandra does offer isolation.
Yes, this is exactly what I mean (and what I need for batch writes to a
single partition).
One caveat, however, is that if 2 writes conflict, they are resolved
using their timestamps, and if the timestamps are the same, resolution
is based on values, which is not necessarily intuitive and may make it
sound like the writes were not applied in isolation (even though
technically they are), see
https://issues.apache.org/jira/browse/CASSANDRA-6123 for details on
that latter problem. I'll note that my definition of isolation does not
mean you can't read stale data, and you can indeed if you use weak
consistency levels.
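
The timestamp caveat above can be illustrated with a small Python sketch. It is a toy model, not Cassandra code: `Cell`, `reconcile` and `merge_rows` are hypothetical names, the tie-break "greater value bytes win" is an assumption made for the illustration, and tombstones are ignored.

```python
from typing import Dict, NamedTuple


class Cell(NamedTuple):
    value: bytes
    timestamp: int


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the winning cell: the higher timestamp wins.

    On equal timestamps, fall back to comparing values (assumed rule
    for this sketch: the greater value bytes win).
    """
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.value >= b.value else b


def merge_rows(a: Dict[str, Cell], b: Dict[str, Cell]) -> Dict[str, Cell]:
    """Merge two writes to the same partition, column by column."""
    merged = dict(a)
    for column, cell in b.items():
        merged[column] = reconcile(merged[column], cell) if column in merged else cell
    return merged


if __name__ == "__main__":
    # Two writes to the same partition carrying the same timestamp.
    write_a = {"x": Cell(b"2", 100), "y": Cell(b"1", 100)}
    write_b = {"x": Cell(b"1", 100), "y": Cell(b"2", 100)}
    print(merge_rows(write_a, write_b))
```

In this model the merged row takes `x` from one write and `y` from the other, so it matches neither write as a whole, even though each write was applied atomically. Distinct timestamps avoid this: the write with the higher timestamp wins on every column.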
I completely share your view/definition of isolation - it's not about
staleness, it's only about a reader not being able to observe partial
writes.
Regarding staleness/consistency, if I want to read the most recent
batch write, a QUORUM read must follow the completed QUORUM (batch)
write, right?
Thanks for your clarifications,
Martin
If you mean something else by isolation, then I think agreeing first
on the definition would be wise.
--
Sylvain
--
Martin Krasser
blog: http://krasserm.github.io
code: http://github.com/krasserm
twitter: http://twitter.com/mrt1nz