[jira] [Commented] (CASSANDRA-6246) EPaxos

Blake Eggleston (JIRA) Sun, 05 Oct 2014 11:32:18 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159620#comment-14159620 ]


Blake Eggleston commented on CASSANDRA-6246:
--------------------------------------------

I’ve been thinking through how epaxos would be affected by repair, read repair, 
and hints.

Since both the read and write parts of an epaxos instance are executed locally 
and asynchronously, it’s possible that a repair could write the result of an 
instance to a node before that instance is executed on that node. The instance 
would then reach a different decision on the node being repaired, which could 
create an inconsistency between nodes. Although it’s difficult to imagine an 
instance taking more time to execute than a repair, I don’t think it’s 
impossible, and it would introduce inconsistencies during normal operation.
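
To make the race concrete, here is a minimal sketch of it. It assumes the 
instance is a simple compare-and-set and models each replica as a plain dict; 
execute_cas and the replica names are hypothetical and are not taken from the 
Cassandra code base.

```python
def execute_cas(store, key, expected, new):
    """Apply a compare-and-set against local state; return whether it applied."""
    if store.get(key) == expected:
        store[key] = new
        return True
    return False


# Both replicas start from the same state and have committed the same instance.
replica_a = {"k": 1}
replica_b = {"k": 1}

# Replica A executes the instance first.
applied_a = execute_cas(replica_a, "k", expected=1, new=2)

# A repair copies A's post-execution value to B before B has executed
# the instance locally.
replica_b["k"] = replica_a["k"]

# B now evaluates the same instance against already-repaired data, so the
# condition no longer holds and B reaches a different decision than A.
applied_b = execute_cas(replica_b, "k", expected=1, new=2)
```

In this sketch the same instance is applied on A and rejected on B, which is 
the kind of divergent decision described above.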

A more likely source of problems would be someone performing a quorum read on 
a key that has instances in flight, triggering a read repair on that key. 
Hints would have a similar problem, but hitting it would also mean that people 
are mixing serialized and unserialized writes concurrently.

Having the node sending the repair message include some metadata about the most 
recent executed instance(s) it's aware of is the best solution I've come up 
with so far. If the receiving node is behind, it could work with the sending 
node to catch up before performing the repair.
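
A rough sketch of that idea follows, under loud assumptions: it tracks a 
single per-key count of executed instances (a simplification of real epaxos 
instance ordering), and RepairMessage, Node, last_executed and catch_up are 
hypothetical names, not existing Cassandra classes or fields.

```python
from dataclasses import dataclass, field


@dataclass
class RepairMessage:
    key: str
    value: str
    # Hypothetical metadata: how many instances for `key` the sender has executed.
    last_executed: int


@dataclass
class Node:
    data: dict = field(default_factory=dict)
    executed: dict = field(default_factory=dict)  # key -> executed instance count
    pending: dict = field(default_factory=dict)   # key -> committed, unexecuted writes

    def catch_up(self, key, target):
        """Execute locally pending instances until `target` is reached, if possible."""
        while self.executed.get(key, 0) < target and self.pending.get(key):
            self.data[key] = self.pending[key].pop(0)
            self.executed[key] = self.executed.get(key, 0) + 1
        return self.executed.get(key, 0) >= target

    def apply_repair(self, msg):
        """Apply the repaired value only once this node is no longer behind."""
        if not self.catch_up(msg.key, msg.last_executed):
            # Still behind: a real implementation would work with the sender
            # to obtain the missing instances before retrying the repair.
            return False
        self.data[msg.key] = msg.value
        return True


behind = Node(pending={"k": ["a", "b"]})
ok = behind.apply_repair(RepairMessage("k", "b", last_executed=2))

missing = Node()
deferred = missing.apply_repair(RepairMessage("k", "b", last_executed=1))
```

The first node can catch up from its own pending instances and then accepts 
the repair; the second has nothing to execute, so the repair is deferred 
rather than applied ahead of the instance.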

> EPaxos
> ------
>
>                 Key: CASSANDRA-6246
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Blake Eggleston
>            Priority: Minor
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and, hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
