> On May 21, 2014, 9:55 p.m., Vinod Kone wrote:
> > src/zookeeper/zookeeper.cpp, line 90
> > <https://reviews.apache.org/r/21783/diff/1/?file=586740#file586740line90>
> >
> > It is kinda bad that an invalid hostname would take up to 'timeout'
> > (10s) for a FATAL error. Too bad the return code of getaddrs() in ZK C
> > client is not exposed to zookeeper_init(). But oh well, I can't figure out
> > if there is a better way.
> >
Well, we can do a getaddrinfo ourselves in the code :)


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21783/#review43654
-----------------------------------------------------------


On May 21, 2014, 6:52 p.m., Ben Mahler wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/21783/
> -----------------------------------------------------------
>
> (Updated May 21, 2014, 6:52 p.m.)
>
>
> Review request for mesos, Vinod Kone and Jiang Yan Xu.
>
>
> Bugs: MESOS-1326
>     https://issues.apache.org/jira/browse/MESOS-1326
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> Often during temporary DNS failovers or outages, we see slaves aborting in
> zookeeper_init.
>
> In many cases, the slave can restart within 10 seconds. Since retrying
> zookeeper_init should be safe to do, this is an attempt to minimize the
> number of unnecessary aborts in the slave.
>
>
> Diffs
> -----
>
>   src/zookeeper/zookeeper.cpp 11029be89bd184dbefe103c84239c1c6b03e3e10
>
> Diff: https://reviews.apache.org/r/21783/diff/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Ben Mahler
>
>
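
For reference, the getaddrinfo idea from the reply could look roughly like the sketch below. It is only an illustration, not what the patch does: `resolveServers` is a hypothetical helper that exists in neither Mesos nor the ZooKeeper C client, and it assumes the servers string is a plain comma-separated "host:port" list with no chroot suffix and no bracketed IPv6 literals. The only real API used is POSIX getaddrinfo()/freeaddrinfo().

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <cstring>
#include <sstream>
#include <string>

// Hypothetical helper, not part of Mesos or the ZooKeeper C client.
// Takes a comma-separated "host:port" list (assumed to carry no chroot
// suffix and no bracketed IPv6 literals) and returns 0 if every host
// resolves. Otherwise returns the first non-zero getaddrinfo() error
// code, which can be passed to gai_strerror() for logging.
int resolveServers(const std::string& servers)
{
  if (servers.empty()) {
    return EAI_NONAME;
  }

  std::istringstream stream(servers);
  std::string entry;

  while (std::getline(stream, entry, ',')) {
    const std::string::size_type colon = entry.rfind(':');
    if (colon == std::string::npos ||
        colon == 0 ||
        colon + 1 == entry.size()) {
      return EAI_NONAME; // Malformed entry: missing host or port.
    }

    const std::string host = entry.substr(0, colon);
    const std::string port = entry.substr(colon + 1);

    struct addrinfo hints;
    std::memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     // Accept IPv4 or IPv6 results.
    hints.ai_socktype = SOCK_STREAM; // ZooKeeper clients connect over TCP.

    struct addrinfo* result = NULL;
    const int code =
      getaddrinfo(host.c_str(), port.c_str(), &hints, &result);

    if (code != 0) {
      return code; // For example EAI_NONAME or EAI_AGAIN.
    }

    freeaddrinfo(result);
  }

  return 0;
}
```

Calling something like this before zookeeper_init() would let the caller see the resolver's return code directly and tell a malformed or unknown hostname apart from a temporary failure such as EAI_AGAIN, instead of waiting out the full timeout. Note that the lookup itself can block while DNS is unavailable, so this does not by itself bound the delay.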
