[
https://issues.apache.org/jira/browse/CASSANDRA-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16381949#comment-16381949
]
Kenneth Brotman edited comment on CASSANDRA-14271 at 3/1/18 4:31 PM:
---------------------------------------------------------------------
See section 4.6 of the Dynamo paper.
was (Author: kenbrotman):
From the Dynamo paper:
4.6 Handling Failures: Hinted Handoff

If Dynamo used a traditional quorum approach it would be unavailable during
server failures and network partitions, and would have reduced durability even
under the simplest of failure conditions. To remedy this it does not enforce
strict quorum membership and instead it uses a “sloppy quorum”; all read and
write operations are performed on the first N healthy nodes from the preference
list, which may not always be the first N nodes encountered while walking the
consistent hashing ring.
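
To make the "first N healthy nodes" rule concrete, here is a minimal Python
sketch. It is not taken from Dynamo or Cassandra: the function name
sloppy_quorum, the five-node ring, and the is_healthy callback are all
hypothetical, and the ring is simplified to a plain list walked clockwise.

```python
def sloppy_quorum(ring, start, n, is_healthy):
    """Return the first n healthy nodes found walking the ring from start.

    ring       -- list of node names in ring order (simplified, hypothetical)
    start      -- index of the first node on the key's preference list
    is_healthy -- callback reporting whether a node is currently reachable
    """
    chosen = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)]
        if is_healthy(node):
            chosen.append(node)
            if len(chosen) == n:
                break
    return chosen


ring = ["A", "B", "C", "D", "E"]
down = {"A"}  # assume A is temporarily unreachable

# With A down, the walk skips it and picks the next healthy node, D.
print(sloppy_quorum(ring, 0, 3, lambda node: node not in down))  # ['B', 'C', 'D']
```

With every node healthy the same call returns A, B and C, which matches a
strict walk of the ring.
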

Consider the example of Dynamo configuration given in Figure 2 with N=3. In
this example, if node A is temporarily down or unreachable during a write
operation then a replica that would normally have lived on A will now be sent
to node D. This is done to maintain the desired availability and durability
guarantees. The replica sent to D will have a hint in its metadata that
suggests which node was the intended recipient of the replica (in this case A).
Nodes that receive hinted replicas will keep them in a separate local database
that is scanned periodically. Upon detecting that A has recovered, D will
attempt to deliver the replica to A. Once the transfer succeeds, D may delete
the object from its local store without decreasing the total number of replicas
in the system. Using hinted handoff, Dynamo ensures that the read and write
operations are not failed due to temporary node or network failures.
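
The A/D example above can be sketched in a few lines of Python. This is only
an illustration under simplifying assumptions, not how Dynamo or Cassandra
implement hints: the Node class, its fields, and deliver_hints are
hypothetical names, the "separate local database" is a plain list, and the
periodic scan is a manual method call.

```python
class Node:
    """Toy storage node with a regular store and a separate hint store."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.store = {}   # regular replicas: key -> value
        self.hints = []   # hinted replicas: (intended_node_name, key, value)

    def write(self, key, value, hint_for=None):
        # A hinted write is kept apart from regular data, tagged with the
        # node that should have received it.
        if hint_for is None:
            self.store[key] = value
        else:
            self.hints.append((hint_for, key, value))

    def deliver_hints(self, cluster):
        # Stand-in for the periodic scan: hand each hint to its intended
        # node if that node is back, otherwise keep it for the next scan.
        remaining = []
        for target, key, value in self.hints:
            if cluster[target].up:
                cluster[target].write(key, value)
            else:
                remaining.append((target, key, value))
        self.hints = remaining


cluster = {name: Node(name) for name in "ABCD"}
cluster["A"].up = False

# A is down, so its replica goes to D with a hint naming A.
cluster["D"].write("k", "v1", hint_for="A")
cluster["D"].deliver_hints(cluster)   # A still down: hint is retained

cluster["A"].up = True
cluster["D"].deliver_hints(cluster)   # A recovered: replica handed off
```

After the second scan, A holds the replica and D's hint list is empty, so the
total number of replicas in the system is unchanged.
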

Applications that need the highest level of availability can set W to 1, which
ensures that a write is accepted as long as a single node in the system has
durably written the key to its local store. Thus, the write request is only
rejected if all nodes in the system are unavailable. However, in practice, most
Amazon services in production set a higher W to meet the desired level of
durability. A more detailed discussion of configuring N, R and W follows in
section 6.

It is imperative that a highly available storage system be capable of handling
the failure of entire data centers. Data center failures happen due to power
outages, cooling failures, network failures, and natural disasters. Dynamo is
configured such that each object is replicated across multiple data centers. In
essence, the preference list of a key is constructed such that the storage
nodes are spread across multiple data centers. These data centers are connected
through high-speed network links. This scheme of replicating across multiple
data centers allows us to handle entire data center failures without a data
outage.
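
The effect of the W setting mentioned in the quoted text can be shown with a
very small Python sketch. The function name write_accepted and the list of
per-replica results are hypothetical; the sketch only counts durable
acknowledgements and ignores timeouts, retries, and hints.

```python
def write_accepted(replicas_written, w):
    """Return True when at least w replicas durably stored the write.

    replicas_written -- one boolean per contacted replica (hypothetical input)
    w                -- minimum number of durable acknowledgements required
    """
    return sum(replicas_written) >= w


# With W=1 a single durable write is enough; a higher W trades
# availability for durability.
one_ack = [True, False, False]
print(write_accepted(one_ack, 1))  # True
print(write_accepted(one_ack, 2))  # False
```
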
> The Hints web page in the web site is empty
> -------------------------------------------
>
> Key: CASSANDRA-14271
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14271
> Project: Cassandra
> Issue Type: Improvement
> Components: Documentation and Website
> Reporter: Kenneth Brotman
> Priority: Major
>
> [http://cassandra.apache.org/doc/latest/operating/hints.html]
> is empty. Please contribute content. I or someone else will take it
> from there.