[ 
https://issues.apache.org/jira/browse/MESOS-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005071#comment-14005071
 ] 

Benjamin Mahler commented on MESOS-1326:
----------------------------------------

To see whether retries are safe, I had the code retry zookeeper_init in an 
infinite loop. With that change, I hit the following check failure:

{noformat}
F0521 18:25:34.214886 35247 group.cpp:318] Check failed: state == CONNECTING (4 vs. 1)
{noformat}
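
For reference, a bounded retry (rather than an infinite loop) could look 
roughly like the sketch below. This is not the Mesos code: retryInit and its 
parameters are hypothetical names, and the callable stands in for a wrapper 
around zookeeper_init, which returns NULL on failure. The sketch only covers 
the retry loop itself and does not address the state check in group.cpp.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper (not part of Mesos or the ZooKeeper C client).
// Calls `init` until it returns a non-null handle or `maxAttempts`
// attempts have been made, doubling the delay between attempts.
// Returns a null pointer if every attempt failed.
template <typename Init>
auto retryInit(Init init, int maxAttempts, std::chrono::milliseconds delay)
  -> decltype(init())
{
  for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
    auto handle = init();
    if (handle != nullptr) {
      return handle;  // Success: hand the handle back to the caller.
    }

    if (attempt < maxAttempts) {
      // Back off before the next attempt, e.g. to ride out a DNS failover.
      std::this_thread::sleep_for(delay);
      delay *= 2;
    }
  }

  return nullptr;  // All attempts failed; the caller decides whether to fatal.
}
```

A caller would pass a lambda that invokes zookeeper_init with its usual 
arguments, and only fatal once retryInit returns a null pointer.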

> Retry policy for zookeeper_init failures
> ----------------------------------------
>
>                 Key: MESOS-1326
>                 URL: https://issues.apache.org/jira/browse/MESOS-1326
>             Project: Mesos
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Jie Yu
>            Assignee: Benjamin Mahler
>              Labels: reliability
>
> Currently, we fatal when we have a zookeeper_init failure. Sometimes, this is 
> annoying because during a DNS failover, we may experience this a lot and we 
> don't necessarily need to fatal in those cases.
> I am wondering whether we can retry on zookeeper_init failures?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
