[ https://issues.apache.org/jira/browse/MESOS-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942694#comment-14942694 ]
Shuai Lin commented on MESOS-1806:
----------------------------------
Currently, when the etcd server dies, the Mesos master and slaves die with it:
{code}
Failed to watch for candidacy: Exhaustively tried all etcd servers; giving up
Failed to detect the leading master: Exhaustively tried all etcd servers; giving up; committing suicide!
Failed to watch for candidacy: Exhaustively tried all etcd servers; giving up
{code}
Compare this with what happens when ZooKeeper dies:
{code}
I1005 00:03:46.300608 313495552 group.cpp:436] Lost connection to ZooKeeper, attempting to reconnect ...
I1005 00:03:46.300655 312958976 group.cpp:436] Lost connection to ZooKeeper, attempting to reconnect ...
I1005 00:03:46.300855 311885824 group.cpp:436] Lost connection to ZooKeeper, attempting to reconnect ...
I1005 00:04:37.711350 161116160 group.cpp:331] Group process (group(4)@127.0.0.1:5050) reconnected to ZooKeeper
{code}
I'll implement a similar reconnection mechanism in the etcd code, so that losing etcd is logged and retried instead of being fatal.
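As a rough illustration of the intended behaviour (not the actual Mesos patch, which would live in the C++ etcd code), the sketch below retries the whole server list with exponential backoff instead of giving up after one pass. The function name `watch_with_reconnect`, the `attempt` callback, and all delay parameters are hypothetical and chosen only for this example.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def watch_with_reconnect(
    servers: Sequence[str],
    attempt: Callable[[str], T],
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    max_rounds: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try each server in turn; if all fail, back off and start over.

    `attempt` is a hypothetical callback that connects to one server and
    raises ConnectionError on failure. With max_rounds=None the loop keeps
    retrying indefinitely, mirroring the ZooKeeper reconnect behaviour.
    """
    delay = initial_delay
    rounds = 0
    while True:
        for server in servers:
            try:
                return attempt(server)
            except ConnectionError:
                continue  # this server is unreachable; try the next one
        rounds += 1
        if max_rounds is not None and rounds >= max_rounds:
            raise ConnectionError(
                "Exhaustively tried all etcd servers; giving up")
        print("Lost connection to etcd, attempting to reconnect ...")
        sleep(delay)
        # Double the wait after each failed round, capped at max_delay.
        delay = min(delay * 2, max_delay)
```

Passing `sleep` as a parameter is only a convenience for testing the backoff schedule without real waiting; whether the retry should ever give up (`max_rounds`) is an open design choice.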
> Substituting etcd for Zookeeper
> -------------------------------
>
> Key: MESOS-1806
> URL: https://issues.apache.org/jira/browse/MESOS-1806
> Project: Mesos
> Issue Type: Task
> Components: leader election
> Reporter: Ed Ropple
> Assignee: Shuai Lin
> Priority: Minor
>
> <adam_mesos> eropple: Could you also file a new JIRA for Mesos to drop ZK
> in favor of etcd or ReplicatedLog? Would love to get some momentum going on
> that one.
> --
> Consider it filed. =)