[ https://issues.apache.org/jira/browse/MESOS-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701016#comment-14701016 ]
Gastón Kleiman commented on MESOS-3280:
---------------------------------------

Another set of logs was added to the Chronos issue: https://github.com/mesos/chronos/issues/511#issuecomment-131993588

In this new case, the initial Mesos Master fails to fetch the replicated log even before the network partition.

> Master fails to access replicated log after network partition
> -------------------------------------------------------------
>
>                 Key: MESOS-3280
>                 URL: https://issues.apache.org/jira/browse/MESOS-3280
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.23.0
>         Environment: Zookeeper version 3.4.5--1
>            Reporter: Bernd Mathiske
>              Labels: mesosphere
>
> In a 5 node cluster with 3 masters and 2 slaves, and ZK on each node, when a
> network partition is forced, all the masters apparently lose access to their
> replicated log. The leading master halts. Unknown reasons, but presumably
> related to replicated log access. The others fail to recover from the
> replicated log. Unknown reasons. This could have to do with ZK setup, but it
> might also be a Mesos bug.
> This was observed in a Chronos test drive scenario described in detail here:
> https://github.com/mesos/chronos/issues/511
> With setup instructions here:
> https://github.com/mesos/chronos/issues/508

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)