[
https://issues.apache.org/jira/browse/CASSANDRA-13442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ariel Weisberg updated CASSANDRA-13442:
---------------------------------------
Description:
Replication factors like RF=2 can't provide strong consistency and availability
because if a single node is lost it's impossible to reach a quorum of replicas.
Stepping up to RF=3 will allow you to lose a node and still achieve quorum for
reads and writes, but requires committing additional storage.
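
The quorum arithmetic behind this can be checked with a minimal sketch. The
helper names below are illustrative only and are not Cassandra APIs; the sketch
assumes a quorum is a simple majority of the replicas.

```python
def quorum(rf: int) -> int:
    """Smallest majority of rf replicas (assumed definition of quorum)."""
    return rf // 2 + 1


def tolerable_failures(rf: int) -> int:
    """Number of replicas that can be down while a quorum is still reachable."""
    return rf - quorum(rf)


for rf in (2, 3):
    print(f"RF={rf}: quorum={quorum(rf)}, "
          f"tolerable failures={tolerable_failures(rf)}")
```

With RF=2 the quorum is both replicas, so no node can be lost; with RF=3 the
quorum is still 2, which leaves room for one failure.
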
The requirement of a quorum for writes/reads doesn't seem to be something that
can be relaxed without additional constraints on queries, but it seems like it
should be possible to relax the requirement that 3 full copies of the entire
data set are kept. What is actually required is a covering data set for the
range and we should be able to achieve a covering data set and high
availability without having three full copies.
After a repair we know that some subset of the data set is fully replicated. At
that point we don't have to read from a quorum of nodes for the repaired data.
It is sufficient to read from a single node for the repaired data and a quorum
of nodes for the unrepaired data.
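
A rough sketch of such a split read follows. It is a toy model, not the
Cassandra read path: each replica's data is a plain dict, a cell is a
hypothetical (timestamp, value) pair, and the newest timestamp wins when the
results are merged.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, str]   # (write timestamp, value); hypothetical representation
Store = Dict[str, Cell]  # key -> cell held by one replica


def newest(cells: List[Optional[Cell]]) -> Optional[Cell]:
    """Return the cell with the highest timestamp, ignoring missing cells."""
    present = [c for c in cells if c is not None]
    return max(present, key=lambda c: c[0], default=None)


def split_read(key: str, repaired_replica: Store,
               unrepaired_quorum: List[Store]) -> Optional[Cell]:
    """Read repaired data from ONE replica and unrepaired data from a quorum."""
    candidates = [repaired_replica.get(key)]
    candidates.extend(store.get(key) for store in unrepaired_quorum)
    return newest(candidates)


# Repaired data is identical on every full replica, so one copy is enough.
repaired_on_one_node = {"k1": (1, "old"), "k2": (2, "x")}
# Unrepaired data may differ between replicas, so a quorum is consulted.
unrepaired_a = {"k1": (5, "new")}
unrepaired_b = {}

result_k1 = split_read("k1", repaired_on_one_node, [unrepaired_a, unrepaired_b])
result_k2 = split_read("k2", repaired_on_one_node, [unrepaired_a, unrepaired_b])
print(result_k1, result_k2)
```

Here k1 is answered by the newer unrepaired write and k2 by the single repaired
copy.
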
One way to exploit this would be to have N replicas, say the last N replicas
(where N varies with RF) in the preference list, delete all repaired data after
a repair completes. Subsequent quorum reads will be able to retrieve the
repaired data from either of the two full replicas and the unrepaired data from
a quorum read of any replicas, including the "transient" replicas.
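
The idea can be illustrated with a toy simulation, assuming RF=3 with two full
replicas and one transient replica. The Replica class, repair() and
quorum_read() below are hypothetical names for this sketch and do not reflect
Cassandra's implementation; repair is assumed to run with all replicas up.

```python
class Replica:
    """Toy replica holding separate repaired and unrepaired key/value sets."""

    def __init__(self, name, transient=False):
        self.name = name
        self.transient = transient
        self.repaired = {}    # key -> (timestamp, value)
        self.unrepaired = {}  # key -> (timestamp, value)

    def write(self, key, ts, value):
        # New writes are always unrepaired until the next repair.
        self.unrepaired[key] = (ts, value)


def repair(replicas):
    """Merge all data; full replicas keep it as repaired, transient ones drop it."""
    merged = {}
    for r in replicas:
        for store in (r.repaired, r.unrepaired):
            for key, cell in store.items():
                if key not in merged or cell[0] > merged[key][0]:
                    merged[key] = cell
    for r in replicas:
        r.unrepaired = {}
        r.repaired = {} if r.transient else dict(merged)


def quorum_read(key, live):
    """Return the newest cell for key among the live replicas that responded."""
    cells = [store[key] for r in live
             for store in (r.repaired, r.unrepaired) if key in store]
    return max(cells, key=lambda c: c[0], default=None)


full_a, full_b = Replica("a"), Replica("b")
transient = Replica("t", transient=True)
replicas = [full_a, full_b, transient]

# k1 reaches a quorum of (a, t) while b is briefly unavailable.
full_a.write("k1", 1, "v1")
transient.write("k1", 1, "v1")
repair(replicas)  # b now has k1; t has deleted its copy

# k2 reaches a quorum of (b, t) after the repair.
full_b.write("k2", 2, "v2")
transient.write("k2", 2, "v2")

# With full replica b down, a quorum of (a, t) still answers both reads.
read_k1 = quorum_read("k1", [full_a, transient])
read_k2 = quorum_read("k2", [full_a, transient])
print(read_k1, read_k2, len(transient.repaired))
```

In this model the transient replica only ever stores data written since the
last repair, which is where the storage saving relative to three full copies
would come from.
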
was:
Replication factors like RF=2 can't provide strong consistency and availability
because if a single node is lost it's impossible to reach a quorum of replicas.
Stepping up to RF=3 will allow you to lose a node and still achieve quorum for
reads and writes, but requires committing additional storage.
The requirement of a quorum for writes/reads doesn't seem to be something that
can be relaxed without additional constraints on queries, but it seems like it
should be possible to relax the requirement that 3 full copies of the entire
data set are kept. What is actually required is a covering data set for the
range and we should be able to achieve a covering data set and high
availability without having three full copies.
After a repair we know that some subset of the data set is fully replicated. At
that point we don't have to read from a quorum of nodes for the repaired data.
It is sufficient to read from a single node for the repaired data and a quorum
of nodes for the unrepaired data.
One way to exploit this would be to have N replicas, say the last N replicas in
the preference list, delete all repaired data after a repair completes.
Subsequent quorum reads will be able to retrieve the repaired data from any of
the two full replicas and the unrepaired data from a quorum read of any replica
including the "transient" replicas.
> Support a means of strongly consistent highly available replication with
> storage requirements approximating RF=2
> ----------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-13442
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13442
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction, Coordination, Distributed Metadata, Local
> Write-Read Paths
> Reporter: Ariel Weisberg
>
> Replication factors like RF=2 can't provide strong consistency and
> availability because if a single node is lost it's impossible to reach a
> quorum of replicas. Stepping up to RF=3 will allow you to lose a node and
> still achieve quorum for reads and writes, but requires committing additional
> storage.
> The requirement of a quorum for writes/reads doesn't seem to be something
> that can be relaxed without additional constraints on queries, but it seems
> like it should be possible to relax the requirement that 3 full copies of the
> entire data set are kept. What is actually required is a covering data set
> for the range and we should be able to achieve a covering data set and high
> availability without having three full copies.
> After a repair we know that some subset of the data set is fully replicated.
> At that point we don't have to read from a quorum of nodes for the repaired
> data. It is sufficient to read from a single node for the repaired data and a
> quorum of nodes for the unrepaired data.
> One way to exploit this would be to have N replicas, say the last N replicas
> (where N varies with RF) in the preference list, delete all repaired data
> after a repair completes. Subsequent quorum reads will be able to retrieve
> the repaired data from either of the two full replicas and the unrepaired
> data from a quorum read of any replicas, including the "transient" replicas.