UPDATE to my suggestion is below.


On 6/16/2011 5:50 PM, Ryan King wrote:
On Thu, Jun 16, 2011 at 2:12 PM, AJ <a...@dude.podzone.net> wrote:
On 6/16/2011 2:37 PM, Ryan King wrote:
On Thu, Jun 16, 2011 at 1:05 PM, AJ <a...@dude.podzone.net> wrote:
<snip>
The Cassandra consistency model is pretty elegant and this type of approach breaks that elegance in many ways. It would also only really be useful when the value has a high probability of being updated between a node going down and the value being read.
I'm not sure what you mean. A node can be down for days, during which time the value can be updated. The intention is to use the nodes available even if they fall below the RF. If there is only one node available for accepting a replica, that should be enough given the conditions I stated and updated below.
If this is your constraint, then you should just use CL.ONE.

My constraint is a CL = "All Available".  So, CL.ONE will not work.
That's a solution, not a requirement. What's your requirement?

OK. This updates my suggestion, removing the need for ALL_AVAIL. It adds logic to cope with unavailable nodes while still achieving consistency in a specific situation.

The general requirement is to eliminate read failures for reads specifying CL = ALL on values that have followed a specific update pattern. The pattern is this: the value was updated (or added) while one or more, but fewer than R, replica nodes were unavailable (at least one replica node was available). If a particular column value is updated *after* the most recent node went down, the new value cannot depend on any replica that is currently unavailable. In that situation, the number of available replicas is irrelevant: after querying all *available* replica nodes, the value with the latest timestamp is consistent if that timestamp is greater than the timestamp at which the last replica node became unavailable.
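To make the rule concrete, here is a minimal Python sketch of the proposed read path (names and types are invented for illustration; this is not Cassandra code). It queries only the reachable replicas, takes the newest value, and accepts it only when its write timestamp post-dates the last outage:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Replica:
    value: str
    timestamp: int  # write timestamp, e.g. microseconds since epoch

def read_all_available(available: list[Replica],
                       last_node_down_ts: int) -> Optional[str]:
    """Sketch of the proposed rule: query every *available* replica and
    return the newest value, but only if its timestamp proves it was
    written after the most recent replica became unavailable."""
    if not available:
        return None  # no replicas reachable; the read cannot be served
    newest = max(available, key=lambda r: r.timestamp)
    if newest.timestamp > last_node_down_ts:
        # The value post-dates every outage, so the down replicas
        # cannot hold anything newer: the read is consistent.
        return newest.value
    # A down replica might hold a newer version; fail as CL.ALL would.
    raise RuntimeError("cannot guarantee consistency: value predates outage")
```

For example, with the last node going down at timestamp 100, a value written at timestamp 150 is accepted, while one written at 90 is rejected.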


<snip>
Well, theoretically, of course; that's the nature of distributed systems. But Cassandra does indeed make that determination when it counts the number of available replica nodes before deciding whether enough replicas are available. This is obvious to you, I'm sure, so maybe I don't understand your statement.
Consider this scenario: given nodes A, B, and C, A thinks C is down but B thinks C is up. What do you do? Remember, A doesn't know that B thinks C is up; it only knows its own state.


What kind of network configuration would produce this scenario? This method only applies within a data center, which should be OK since replication across data centers seems to be mostly for fault tolerance... but I will have to think about this.
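The divergent-view scenario above can be sketched in a few lines (a hypothetical toy model, not Cassandra's failure detector). Each node keeps only its own opinion of when peers went down, so "the timestamp of the last node that became unavailable" is not a single global fact:

```python
class NodeView:
    """One node's local opinion about which peers are down."""
    def __init__(self):
        self.down_since: dict[str, int] = {}  # peer -> local ts it was marked down

    def mark_down(self, peer: str, ts: int):
        self.down_since[peer] = ts

    def last_down_ts(self) -> int:
        # 0 means "every peer looks up from this node's perspective"
        return max(self.down_since.values(), default=0)

a, b = NodeView(), NodeView()
a.mark_down("C", 120)   # A's detector declared C dead at ts 120
# B never marked C down, so B's view of the cluster differs.

write_ts = 110  # a value written before A saw C fail

# Coordinating through A, the read is rejected (write predates A's view
# of the outage); through B, it is accepted -- the two nodes disagree.
assert (write_ts > a.last_down_ts()) is False
assert (write_ts > b.last_down_ts()) is True
```

The point of the sketch is that the proposed "latest timestamp vs. last-down timestamp" test gives different answers depending on which coordinator handles the read.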

-ryan

