-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21783/#review43654
-----------------------------------------------------------

Ship it!


src/zookeeper/zookeeper.cpp
<https://reviews.apache.org/r/21783/#comment77939>

    It is kinda bad that an invalid hostname would take up to 'timeout' (10s) for a FATAL error. Too bad the return code of getaddrs() in ZK C client is not exposed to zookeeper_init(). But oh well, I can't figure out if there is a better way.


- Vinod Kone


On May 21, 2014, 6:52 p.m., Ben Mahler wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/21783/
> -----------------------------------------------------------
>
> (Updated May 21, 2014, 6:52 p.m.)
>
>
> Review request for mesos, Vinod Kone and Jiang Yan Xu.
>
>
> Bugs: MESOS-1326
>     https://issues.apache.org/jira/browse/MESOS-1326
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> Often during temporary DNS failovers or outages, we see slaves aborting in
> zookeeper_init.
>
> In many cases, the slave can restart within 10 seconds. Since retrying
> zookeeper_init should be safe to do, this is an attempt to minimize the
> number of unnecessary aborts in the slave.
>
>
> Diffs
> -----
>
>   src/zookeeper/zookeeper.cpp 11029be89bd184dbefe103c84239c1c6b03e3e10
>
> Diff: https://reviews.apache.org/r/21783/diff/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Ben Mahler
>
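
For readers without the diff at hand, the retry-until-timeout idea in the quoted description can be sketched roughly as below. This is not the code from the patch. The helper name retryInit, its parameters, and the fixed retry interval are all hypothetical, and the callable stands in for whatever wraps zookeeper_init().

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper (not from the patch): keeps calling `init` until it
// returns a non-null handle or `timeout` has elapsed, sleeping `interval`
// between attempts. `init` is always tried at least once.
template <typename Init>
auto retryInit(
    Init init,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds interval) -> decltype(init())
{
  const auto deadline = std::chrono::steady_clock::now() + timeout;

  while (true) {
    auto handle = init();
    if (handle != nullptr) {
      return handle; // Initialization succeeded.
    }

    if (std::chrono::steady_clock::now() >= deadline) {
      return nullptr; // Out of time; the caller decides whether to abort.
    }

    std::this_thread::sleep_for(interval);
  }
}
```

In this sketch the callable would wrap the zookeeper_init() call, which returns NULL on failure, and the caller would only treat the error as fatal once retryInit itself returns null. This also shows the cost raised in the review comment: a hostname that can never resolve still takes the full timeout to fail, because the helper cannot tell a transient DNS failure from a permanent one.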
