Aaron, thank you! Your message was exactly what we wanted to hear: confirmation that we didn't miss anything critical. We'll share our Astyanax patch in the future.
On 10 September 2012 03:44, aaron morton <aa...@thelastpickle.com> wrote:

> In general we want to achieve strong consistency.
>
> You need to have R + W > N
>
> LOCAL_QUORUM and reads with ONE.
>
> Gives you 2 + 1 > 2 when you use it. When you drop back to ONE / ONE you
> no longer have strong consistency.
>
> maybe get advice on how to improve it.
>
> Sounds like you know how to improve it :)
>
> Things you could play with:
>
> * hinted_handoff_throttle_delay_in_ms in the YAML, to reduce the time it
> takes for HH to deliver the messages.
> * increase the read_repair_chance for the CFs. This will increase the
> chance of RR repairing an inconsistency behind the scenes, so the next read
> is consistent. It will also increase the IO load on the system.
>
> With the RF 2 restriction you are probably doing the best you can. You are
> giving up consistency for availability and partition tolerance. The best
> thing to do is to get people to agree that "we will accept reduced
> consistency for high availability" rather than say "in general we want to
> achieve strong consistency".
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 9/09/2012, at 9:09 PM, Sergey Tryuber <stryu...@gmail.com> wrote:
>
> Hi
>
> We have to use Cassandra with RF=2 (don't ask why...). There are two
> datacenters (RF=2 in each datacenter). We also use Astyanax as a client
> library. In general we want to achieve strong consistency. Read performance
> is important for us, which is why we perform writes with LOCAL_QUORUM and
> reads with ONE. If one server is down, we automatically switch to
> Writes.ONE / Reads.ONE, but only for the replica set that contains the
> failed node (we modified Astyanax to achieve that). When the server comes
> back, we switch back to Writes.LOCAL_QUORUM and Reads.ONE, but, of course,
> we see some inconsistencies during the switching process and for some time
> after (while hinted handoff is delivering the missed writes).
>
> Basically I don't have any questions, I just want to share our "ugly"
> failover algorithm, to hear your criticism and maybe get advice on how to
> improve it. Unfortunately we can't change the replication factor, and most
> of the time we have to read with consistency level ONE (because we have
> strict requirements on read performance).
>
> Thank you!
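For anyone following the thread, the R + W > N rule Aaron cites can be sketched as a quick check. This is a minimal illustration, not code from the thread; the function name is hypothetical, and the consistency-level counts assume RF=2 per datacenter as in Sergey's setup:

```python
def is_strongly_consistent(reads: int, writes: int, replicas: int) -> bool:
    """A read is guaranteed to overlap the latest write when R + W > N."""
    return reads + writes > replicas

RF = 2  # replication factor per datacenter, as in Sergey's setup

# Normal mode: writes at LOCAL_QUORUM (2 of 2 replicas), reads at ONE.
# 1 + 2 > 2, so every read overlaps the most recent write.
print(is_strongly_consistent(reads=1, writes=2, replicas=RF))  # True

# Failover mode: writes and reads both at ONE.
# 1 + 1 = 2, so a read may hit the replica that missed the write.
print(is_strongly_consistent(reads=1, writes=1, replicas=RF))  # False
```

This is why the switch to ONE / ONE during failover necessarily gives up strong consistency until hinted handoff (or read repair) catches the lagging replica up.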