> using CL=ONE (read) and CL=ALL (write).

With this setting you are saying the application should fail in the case of a network partition: when a partition occurs you are choosing Consistency over Availability. Mixing the CL levels in response to a partition will make it difficult to reason about the consistency of the data. Consider other approaches, such as CL QUORUM.
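As a rough sketch of the quorum arithmetic (plain Python, illustrative only — not driver code; the segment sizes are taken from the setup quoted below):

```python
# Illustrative quorum arithmetic for Cassandra consistency levels.
# QUORUM needs a majority of the replicas: RF // 2 + 1.

def replicas_required(cl, rf):
    """Live replicas needed to satisfy a consistency level (simplified)."""
    return {"ONE": 1, "TWO": 2, "QUORUM": rf // 2 + 1, "ALL": rf}[cl]

def can_serve(cl, rf, live_replicas):
    """Can a request at this CL succeed with this many reachable replicas?"""
    return live_replicas >= replicas_required(cl, rf)

rf = 6                                    # RF=6 as in the setup below
print(replicas_required("QUORUM", rf))    # QUORUM on RF=6 needs 4 replicas
print(can_serve("QUORUM", rf, 4))         # S1 = {A,B,C,D}: True
print(can_serve("QUORUM", rf, 2))         # S2 = {E,F}: False
print(can_serve("ALL", rf, 4))            # CL=ALL fails in any partition: False
```

So with RF=6, a 4/2 split leaves QUORUM usable in the larger segment only, while CL=ALL is unusable in both.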
While using CL ONE for reads looks good, if you require strong consistency you have to use ALL for writes. QUORUM for reads and writes may be a better choice: using RF 3 and 6 nodes would give you pretty good availability in the face of node failures (for background see http://thelastpickle.com/2011/06/13/Down-For-Me/ ). Or relax consistency and use CL ONE for reads and CL QUORUM for writes. The writes are still sent to RF nodes, but we can no longer guarantee reads will see them. If the high RF is for

> Suppose that connectivity breaks down (for whatever reason) causing two
> isolated segments:
> S1 = {A,B,C,D} and S2 = {E,F}.

Do clients still have access to the entire cluster? Normally we would expect clients to try different nodes until they either fail or find a partition with enough UP nodes to service the request.

> to be able to write at all, the CL strategy definitely needs to be changed.
> In S1, for instance change to CL=QUORUM for both reads/writes
> In S2, CL(write) change to TWO/ONE/ANY. CL(read) may be changed to TWO

Whatever the choice, you can imagine a partition where the only thing that works for writes is CL ONE; e.g. if the cluster splits 3/3, QUORUM would not work.

> So now to the interesting question, what happens when S1 and S2 reestablish
> full connectivity again ?

If you are using CL ALL for writes the easiest thing to do is to stop writing when the cluster partitions, and resume when it comes back. If you drop the CL during writes, reads will be inconsistent until either Hinted Handoff has finished or you run repairs.

> It is extremely important that reads will continue to operate in both S1 and
> S2

If it's important that reads continue and are Consistent, I would look at RF 3 with QUORUM / QUORUM. If it's important that reads continue and consistency can be relaxed, I would look at RF 3 (or 6) with read ONE / write QUORUM.

Hope that helps.
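To make the read/write trade-off above concrete, here is a minimal sketch (plain Python, my own illustration) of the overlap rule: a read is guaranteed to see the latest write only when the replicas read plus the replicas written exceed RF.

```python
# Overlap rule sketch: reads see the latest write only if
# replicas_read + replicas_written > RF (read and write sets must intersect).

def cl_replicas(cl, rf):
    """Replicas that must acknowledge a request at this CL (simplified)."""
    return {"ONE": 1, "TWO": 2, "QUORUM": rf // 2 + 1, "ALL": rf}[cl]

def strongly_consistent(read_cl, write_cl, rf):
    return cl_replicas(read_cl, rf) + cl_replicas(write_cl, rf) > rf

print(strongly_consistent("ONE", "ALL", 6))        # 1 + 6 > 6 -> True
print(strongly_consistent("QUORUM", "QUORUM", 3))  # 2 + 2 > 3 -> True
print(strongly_consistent("ONE", "QUORUM", 3))     # 1 + 2 > 3 -> False
```

This is why ONE/ALL and QUORUM/QUORUM give strong consistency, while ONE reads with QUORUM writes do not: the read may land on a replica the write quorum missed.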
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/08/2012, at 11:00 PM, Robert Hellmans <robert.hellm...@aastra.com> wrote:

> Hi !
>
> I'm preparing the test below. I've found a lot of information about dead-node
> replacements and adding extra nodes to increase capacity, but didn't find
> anything about this segmentation issue. Anyone that can share
> experience/ideas ?
>
> Setup:
> Cluster with 6 nodes {A,B,C,D,E,F}, RF=6, using CL=ONE (read) and
> CL=ALL (write).
>
> Suppose that connectivity breaks down (for whatever reason) causing two
> isolated segments:
> S1 = {A,B,C,D} and S2 = {E,F}.
>
> Cluster connectivity anomalies will be detected by all nodes in this setup,
> so clients in S1 and S2 can be advised to change their CL strategy. It is
> extremely important that reads will continue to operate in both S1 and S2,
> and I don't see any reason why they shouldn't. It is almost that important
> that writes in each segment can continue, but to be able to write at all,
> the CL strategy definitely needs to be changed.
> In S1, for instance change to CL=QUORUM for both reads/writes.
> In S2, CL(write) change to TWO/ONE/ANY. CL(read) may be changed to TWO.
>
> During the connectivity breakdown, clients in both S1 and S2 simultaneously
> change/add/delete data.
>
> So now to the interesting question, what happens when S1 and S2 reestablish
> full connectivity again ?
> Again, the re-connectivity event will be detected, so should I trigger some
> special repair sequence ?
> Or should I've been doing some actions already when the connectivity broke ?
> What about connectivity dropout time, longer/shorter than max_hint_window ?
>
> Rds /Robert