Andrew Sibley created CASSANDRA-11204:
-----------------------------------------
Summary: Add support for EACH_ONE write consistency level
Key: CASSANDRA-11204
URL: https://issues.apache.org/jira/browse/CASSANDRA-11204
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Andrew Sibley
I have seen CASSANDRA-4412, where this was originally proposed, and in fact a
patch is attached to it which includes many of the changes I thought would be
necessary (perhaps it was never merged because it was not tested or
something?)
Anyway, CASSANDRA-4412 seems to have been closed as the raiser of the ticket
could not justify a use case, but I have a use case for EACH_ONE:
I have a 2 data centre setup, each DC with 3 nodes, and replication set to 3 per
DC (i.e. every node has all rows of my data). My app is write heavy, and writes
need to be fast, so I use a relatively low consistency level (QUORUM is too
slow, so I'm using TWO at the moment). I would like to have strong consistency
guarantees, but also tolerate a DC outage. My applications run in the same 2
DCs as Cassandra, meaning that if a DC fails, some of my apps are almost certain
to fail as well, and apps will then need to read data from Cassandra (e.g. apps
being moved over to the other DC, or other instances in the other DC loading
more data into memory).
This means that it is possible (likely) that an app will be writing some data
which has only been written to the local DC. If that DC fails, so will the app,
and the latest writes may not have replicated to the remote DC yet (and now they
won't until the local DC is alive again). The app will now be started in the
remote DC - it will start up, try to read at ALL (fail), then QUORUM (fail),
then THREE (pass). But the THREE read can only reach the DC which is still
alive, which likely doesn't have the latest writes.
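
To make the ALL / QUORUM / THREE arithmetic above concrete, here is a rough
Python sketch. It is only a model of the replica counts, not Cassandra code;
the function names required_replicas and first_readable_level are made up for
illustration, and the setup (2 DCs, replication factor 3 in each, so 6 replicas
in total) is the one described above.

```python
# Model of how many replicas must respond for the non-DC-aware levels
# used in this ticket. Assumes 2 DCs x RF 3 = 6 replicas per row.
TOTAL_REPLICAS = 6


def required_replicas(level, total=TOTAL_REPLICAS):
    """Return the number of replica responses a consistency level needs."""
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        # Majority of all replicas across both DCs.
        return total // 2 + 1
    if level == "ALL":
        return total
    raise ValueError("unknown level: %s" % level)


def first_readable_level(alive, ladder=("ALL", "QUORUM", "THREE")):
    """Walk the fallback ladder and return the first level that can succeed."""
    for level in ladder:
        if required_replicas(level) <= alive:
            return level
    return None


# One DC down leaves 3 of 6 replicas alive: ALL needs 6, QUORUM needs 4,
# so the ladder falls through to THREE.
print(first_readable_level(alive=3))
```

With all 6 replicas alive the same ladder stops at ALL, which is why the
fallback only matters during an outage.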
I know I cannot achieve absolute consistency with 1 DC down (3/6 nodes are
down), but I would like to try as best I can to guarantee a high level of
consistency even during a DC outage, as the data is highly critical.
However, if I had EACH_ONE available, I could guarantee that both DCs have at
least one node with the latest data (through EACH_ONE write level). Then I
could try reading at ALL (fail if a DC is down), then QUORUM (fail if a DC is
down), then THREE (pass). THREE in this scenario is still OK, as I know both
DCs have at least one copy of the latest data.
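
The difference between writing at TWO and writing at EACH_ONE can be sketched
as follows. This is a simplified Python model under my own assumptions, not the
patch from CASSANDRA-4412: each_one_satisfied and read_newest are hypothetical
helpers, each replica is represented only by the timestamp of the newest write
it holds, and a read is assumed to return the newest timestamp among the
replicas it contacts.

```python
def each_one_satisfied(acks_by_dc):
    """True when every DC has acknowledged the write on at least one replica."""
    return all(count >= 1 for count in acks_by_dc.values())


def read_newest(replica_timestamps):
    """Model a read: the newest write among the contacted replicas wins."""
    return max(replica_timestamps)


# Write with timestamp 2 acknowledged at TWO by two nodes in the local DC only.
print(each_one_satisfied({"DC1": 2, "DC2": 0}))  # False: DC2 has no copy yet

# DC1 then fails. A THREE read can only reach the three DC2 replicas,
# all of which still hold the older write (timestamp 1).
print(read_newest([1, 1, 1]))  # 1, i.e. stale data

# Same write at EACH_ONE: at least one DC2 replica holds timestamp 2 before
# the write is acknowledged, so the THREE read over DC2 sees it.
print(each_one_satisfied({"DC1": 1, "DC2": 1}))  # True
print(read_newest([2, 1, 1]))  # 2, i.e. the latest write
```

The THREE read works here only because the surviving DC has exactly 3
replicas, so the read is forced to contact all of them.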
Obviously there are other scenarios where this doesn't help (my app ends up
writing to one node in each DC, but both of those nodes fail before either
manages to replicate to any other nodes), but I think this is much less likely
as the DCs are independent. An entire DC outage will definitely happen at some
stage though (power outage, networking issue, etc.).
In summary, I think EACH_ONE write level would generally be beneficial to those
with a small number of DCs who want to ensure every DC has some data, to
protect critical data against DC outages.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)