[
https://issues.apache.org/jira/browse/MESOS-770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Mahler updated MESOS-770:
----------------------------------
Component/s: replicated log
> Rate control and randomization of Replicated Log catching-up
> ------------------------------------------------------------
>
> Key: MESOS-770
> URL: https://issues.apache.org/jira/browse/MESOS-770
> Project: Mesos
> Issue Type: Improvement
> Components: replicated log
> Reporter: Yan Xu
>
> When the log is catching up, either during recovery or after a
> coordinator failover, the Paxos protocol is run on multiple positions
> (possibly the entire log) concurrently. Too much concurrency can have a
> negative impact on the network, and the problem may be exacerbated by
> contention among multiple recovering replicas and the coordinator.
> Rate control limits the number of positions for which a proposer
> (recoverer or coordinator) seeks consensus at a time. We can do this by
> handling a batch of positions at a time.
> Randomly picking the positions in each batch reduces the chance that
> multiple proposers contend for the same position at the same time, which
> causes conflicts and retries.
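
The batching and randomization described above could be sketched roughly as follows. This is only an illustration in Python, not the Mesos implementation: the function name `randomized_batches`, its parameters, and the choice to shuffle once and then slice are all assumptions made for the example. Shuffling the pending positions and cutting them into fixed-size chunks gives each batch a random selection without repeating any position.

```python
import random
from typing import Iterable, Iterator, List, Optional


def randomized_batches(
    positions: Iterable[int],
    batch_size: int,
    rng: Optional[random.Random] = None,
) -> Iterator[List[int]]:
    """Yield the given log positions in random order, batch_size at a time.

    Hypothetical helper: a proposer would seek consensus on one batch,
    wait for it to finish, and only then move on to the next batch.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    # Use a private generator unless the caller supplies one (e.g. seeded).
    rng = rng if rng is not None else random.Random()

    # Shuffle once so that each proposer walks the log in a different order,
    # which lowers the chance of two proposers hitting the same position.
    pending = list(positions)
    rng.shuffle(pending)

    # Rate control: expose at most batch_size positions at a time.
    for start in range(0, len(pending), batch_size):
        yield pending[start:start + batch_size]


if __name__ == "__main__":
    # Example: catch up on positions 0..9, at most 4 in flight at once.
    for batch in randomized_batches(range(10), batch_size=4):
        print(batch)
```

Each proposer using its own random order means that two proposers catching up on the same range are unlikely to pick the same positions in the same batch, while `batch_size` caps how many positions are in flight per proposer.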
--
This message was sent by Atlassian JIRA
(v6.1#6144)