[ https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159620#comment-14159620 ]
Blake Eggleston commented on CASSANDRA-6246:
--------------------------------------------
I’ve been thinking through how epaxos would be affected by repair, read repair,
and hints.
Since the read and write parts of an epaxos instance are both executed
locally and asynchronously, it’s possible that a repair could write the result
of an instance to a node before that instance is executed on that node. This
would cause the decision of the epaxos instance to be different on the node
being repaired, which could create an inconsistency between nodes. Although
it’s difficult to imagine an instance taking more time to execute than a
repair, I don’t think it’s impossible, and if it happened it would introduce
inconsistencies during normal operation.
A more likely source of problems is someone performing a quorum read on a key
that has instances in flight, and triggering a read repair on that key. Hints
would have a similar problem, although running into it would also mean that
people are mixing serialized and unserialized writes concurrently.
Having the node sending the repair message include some metadata about the most
recently executed instance(s) it's aware of is the best solution I've come up
with so far. If the receiving node is behind, it could work with the sending
node to catch up before performing the repair.
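
A minimal sketch of that idea in Python, for illustration only. Everything in
it is hypothetical: the names (Instance, Node, RepairMessage, make_repair,
receive_repair) are not Cassandra APIs, an instance is modelled as a simple
compare-and-set on one key, and executed instances are tracked with a single
increasing id instead of the dependency graph that epaxos actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A committed instance, modelled as a compare-and-set (assumption)."""
    instance_id: int
    key: str
    expected: object
    new_value: object


@dataclass
class RepairMessage:
    """Repaired data plus metadata about what the sender has executed."""
    key: str
    value: object
    last_executed: int


@dataclass
class Node:
    data: dict = field(default_factory=dict)
    committed: dict = field(default_factory=dict)   # instance_id -> Instance
    decisions: dict = field(default_factory=dict)   # instance_id -> applied?
    last_executed: int = 0

    def execute_through(self, target_id):
        """Execute committed instances in id order, up to target_id."""
        for instance_id in sorted(self.committed):
            if self.last_executed < instance_id <= target_id:
                inst = self.committed[instance_id]
                applied = self.data.get(inst.key) == inst.expected
                if applied:
                    self.data[inst.key] = inst.new_value
                self.decisions[instance_id] = applied
                self.last_executed = instance_id


def make_repair(sender, key):
    """Build a repair message that carries the sender's execution metadata."""
    return RepairMessage(key, sender.data[key], sender.last_executed)


def receive_repair(receiver, sender, msg):
    """Catch up with the sender if behind, then apply the repaired value."""
    if receiver.last_executed < msg.last_executed:
        # Learn any instances the receiver is missing, then execute them
        # before the repaired data can change their outcome.
        for instance_id, inst in sender.committed.items():
            receiver.committed.setdefault(instance_id, inst)
        receiver.execute_through(msg.last_executed)
    receiver.data[msg.key] = msg.value


# Node a has executed instance 1; node b has committed it but not executed it.
a = Node(data={"k": 1})
b = Node(data={"k": 1})
cas = Instance(1, "k", expected=1, new_value=2)
a.committed[1] = cas
b.committed[1] = cas
a.execute_through(1)

receive_repair(b, a, make_repair(a, "k"))
assert b.decisions == a.decisions == {1: True}
```

Without the catch-up step, the repair would set k to 2 on node b first, and b
would then evaluate the compare-and-set against 2 and reach a different
decision than node a, which is the inconsistency described above.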
> EPaxos
> ------
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Blake Eggleston
> Priority: Minor
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is
> that Multi-paxos requires leader election and hence, a period of
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than multi-paxos,
> (2) is particularly useful across multiple datacenters, and (3) allows any
> node to act as coordinator:
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to
> implement it.