[jira] [Commented] (CASSANDRA-6246) EPaxos

sankalp kohli (JIRA) Tue, 11 Nov 2014 22:50:29 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207741#comment-14207741
 ]


sankalp kohli commented on CASSANDRA-6246:
------------------------------------------

"5) ExecutionSorter.getOrder(). Here the if condition uncommitted.size() == 0 is 
always true. Also, loadedScc is empty, as we don't insert into it.
ids are being put into uncommitted in the addInstance method, so it won’t 
always equal 0, good catch on the loadedScc though. I’ll get that fixed."
We only call ExecutionSorter.getOrder() in the else branch of the 
executionSorter.uncommitted.size() > 0 check in ExecuteTask.run(), so the 
check inside getOrder() is redundant and can be removed.
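
To illustrate the control flow I mean, here is a minimal sketch. The class 
bodies are hypothetical stand-ins and not the code in the patch; only the 
names ExecutionSorter, getOrder(), uncommitted and the branch in 
ExecuteTask.run() come from the discussion above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

// Hypothetical stand-in classes; only the branch structure matters here.
class RedundantCheckSketch {

    static class ExecutionSorter {
        final Set<UUID> uncommitted = new HashSet<>();
        final List<UUID> order = new ArrayList<>();

        List<UUID> getOrder() {
            // No uncommitted.size() == 0 guard here: the only caller has
            // already established that uncommitted is empty.
            return order;
        }
    }

    // Assumed shape of the branch in ExecuteTask.run().
    static List<UUID> run(ExecutionSorter sorter) {
        if (sorter.uncommitted.size() > 0) {
            // Still waiting on uncommitted instances; nothing to execute yet.
            return Collections.emptyList();
        } else {
            return sorter.getOrder();
        }
    }
}
```

If getOrder() ever gains a second caller, the guard would have to come back, 
so this only holds while ExecuteTask.run() is the sole call site.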



"Missing instances are sent both ways. When a node responds to a preaccept 
message, if it believes the leader is missing an instance, it will include it 
in its response. Once the leader has received all the responses, if it thinks 
any of the replicas are missing instances, it will send them along."
I think there is no need to send them. Since we already send all of the 
endpoint's dependencies in the response to the leader, the leader can compute 
the diff itself. There is no point in sending duplicate information over the 
wire, so I think PreacceptVerbHandler doesn't need to calculate and send the 
missing instances. 
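
A rough sketch of the diff I have in mind on the leader side, assuming 
dependencies are exchanged as sets of instance ids. The class and method names 
are hypothetical and not part of the patch. Note that this only yields ids; 
how the leader obtains the contents of an instance it has never seen is not 
covered here.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical helper; assumes dependencies are plain sets of instance ids.
class MissingInstanceDiff {

    // Ids the replica reported as dependencies that the leader does not know.
    static Set<UUID> missingOnLeader(Set<UUID> leaderDeps, Set<UUID> replicaDeps) {
        Set<UUID> missing = new HashSet<>(replicaDeps);
        missing.removeAll(leaderDeps);
        return missing;
    }

    // Ids the leader knows that were absent from the replica's response.
    static Set<UUID> missingOnReplica(Set<UUID> leaderDeps, Set<UUID> replicaDeps) {
        Set<UUID> missing = new HashSet<>(leaderDeps);
        missing.removeAll(replicaDeps);
        return missing;
    }
}
```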

"Speaking of which, the default of not waiting for an fsync before considering 
a write successful is a more serious problem for paxos/epaxos, since a paxos 
node forgetting its state can cause inconsistencies."
I agree we can tackle this later. But here it is more dangerous, because once 
an endpoint is out of sync, no further updates can be applied, as condition 
checks are local. In the current paxos, if a machine is in this situation and 
cannot apply a commit, the next commit will still be applied, as condition 
checks are done at the quorum level. 





> EPaxos
> ------
>
>                 Key: CASSANDRA-6246
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Blake Eggleston
>            Priority: Minor
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
