Mesos passes the host list between "zk://" and the first "/" directly into Zookeeper's C bindings. I'm not familiar enough with the Zookeeper API to say for certain, but it looks like this *does* support your round-robin scheme. You can double check here: https://github.com/apache/zookeeper/blob/release-3.4.8/src/c/src/zookeeper.c#L620-L650 <https://github.com/apache/zookeeper/blob/release-3.4.8/src/c/src/zookeeper.c#L777>
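To make the two pieces concrete, here is a minimal Python sketch (not the actual Mesos or Zookeeper code, which is C/C++). It shows how the host list can be cut out of a zk:// URL, and how a client can walk a server list in a circle until one server answers. The names `parse_zk_url`, `connect_with_rotation` and `try_connect` are hypothetical and exist only for this illustration; credentials in the URL are not handled.

```python
def parse_zk_url(url):
    """Split a zk:// URL into (host_list, znode_path).

    host_list is everything between "zk://" and the first "/",
    returned unmodified as a single string.
    """
    prefix = "zk://"
    if not url.startswith(prefix):
        raise ValueError("expected a URL starting with zk://")
    rest = url[len(prefix):]
    # partition() splits on the first "/" only.
    host_list, _, path = rest.partition("/")
    return host_list, "/" + path


def connect_with_rotation(servers, try_connect, start=0):
    """Try each server once, beginning at index `start` and wrapping around.

    `try_connect` is a caller-supplied function (an assumption of this
    sketch) that returns True when the given server accepts the connection.
    Returns the index of the first server that answered, or None.
    """
    for offset in range(len(servers)):
        index = (start + offset) % len(servers)
        if try_connect(servers[index]):
            return index
    return None


if __name__ == "__main__":
    hosts, path = parse_zk_url("zk://zookeeper-mesos-dc:2181/mesos")
    print(hosts, path)
```

With a round robin DNS name, `servers` would be the addresses the name resolved to, so a dead server is skipped on the next attempt rather than retried forever.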
As for your other questions:

1) When the ZK connection is lost, Mesos will re-resolve the address as of MESOS-4546 (https://issues.apache.org/jira/browse/MESOS-4546).

2) Looks like ZK rotates between servers. When one server returns an error, it tries the next one, in a circle.
Increment code: https://github.com/apache/zookeeper/blob/release-3.4.8/src/c/src/zookeeper.c#L1248
Connection code: https://github.com/apache/zookeeper/blob/release-3.4.8/src/c/src/zookeeper.c#L1578-L1580

On Wed, May 25, 2016 at 3:50 PM, Zhitao Li <[email protected]> wrote:

> Hi,
>
> Can someone confirm whether the zookeeper library Mesos is using well
> supports round robin DNS?
>
> For example, if I have a round robin DNS entry `zookeeper-mesos-dc` which
> resolves to five A records, would the --zk flag value
> `zk://zookeeper-mesos-dc:2181/mesos` work on master, agents and any
> framework using libmesos driver?
>
> Also:
> 1. What happens if one of the A records changes in the DNS record? Do I
> need to restart related Mesos processes?
> 2. What happens if one of the A records is not responsive (e.g. underlying
> zookeeper server is dead)? Is the zookeeper library capable to avoid
> the bad server?
>
> Some pointer for me to find out the answer individually is also greatly
> appreciated.
>
> Thanks!
>
> --
> Cheers,
>
> Zhitao Li
