Hi all, after looking through the mesos source code for a while, here are some of my initial thoughts.
There seem to be at least two issues that can be tackled separately:

 - Communication between mesos daemons over the network
 - Communication in and out of containers when using network isolation

Having the first one would already be valuable for installations that don't use network isolation, so I'll focus on this for now.

If a mesos master daemon runs on, say, mesos-master.example.org:5050, and this host has both A and AAAA records configured, it seems desirable that slaves can communicate with this node over both IPv4 and IPv6, depending on their own capabilities.

From the client perspective, the problem is solved by the "Happy Eyeballs" algorithm, i.e. trying both possibilities and using the one where it is possible to connect. The only complication is that address resolution should probably be delayed until we actually want to connect, to avoid spurious failures.

On the server side it is a bit more subtle, since the server has to decide which address it should bind its listening socket to. Some possibilities would be:

 1) Do nothing special, just bind to the address that was specified
 2) Allow specifying multiple listen IPs
 3) Allow specifying a network interface and port, and open two separate listening sockets for IPv4 and IPv6

These are not mutually exclusive. It seems that (2) and (3) would be desirable anyway, since they would also enable running on hosts with multiple network interfaces.

It is however worth noting that (1) already gets us quite far without changing the assumption that there is a single IP associated with a mesos daemon: if an IPv4 address is specified, things will work the same as before, and if an IPv6 address is specified, it will by default accept connections from both IPv4 and IPv6 sources. This behaviour can even be changed at system level if not desired (via /proc/sys/net/ipv6/bindv6only, or the mac/windows equivalent).
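To make the client and server side a bit more concrete, here is a rough sketch using plain POSIX sockets rather than libprocess. The names resolveCandidates, connectAny and listenDualStack are made up for illustration and do not exist in mesos or stout. Note that connectAny simply tries the candidates one after another, which is a simplification of Happy Eyeballs (the real algorithm staggers and races the attempts), and listenDualStack assumes we bind to the wildcard address "::" and explicitly clear IPV6_V6ONLY instead of relying on the system default.

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical type: one resolved address we could try to connect to.
struct Candidate {
  int family;
  int socktype;
  int protocol;
  sockaddr_storage addr;
  socklen_t length;
};

// Resolve host:port for both IPv4 and IPv6 (AF_UNSPEC). Meant to be called
// right before connecting, so resolution is not done too early.
std::vector<Candidate> resolveCandidates(
    const std::string& host, const std::string& port)
{
  addrinfo hints{};
  hints.ai_family = AF_UNSPEC;      // Accept both A and AAAA results.
  hints.ai_socktype = SOCK_STREAM;

  std::vector<Candidate> candidates;
  addrinfo* results = nullptr;
  if (getaddrinfo(host.c_str(), port.c_str(), &hints, &results) != 0) {
    return candidates;              // Empty on resolution failure.
  }

  for (addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
    Candidate c{};
    c.family = ai->ai_family;
    c.socktype = ai->ai_socktype;
    c.protocol = ai->ai_protocol;
    std::memcpy(&c.addr, ai->ai_addr, ai->ai_addrlen);
    c.length = ai->ai_addrlen;
    candidates.push_back(c);
  }

  freeaddrinfo(results);
  return candidates;
}

// Try each candidate in order and return the first connected fd, or -1.
// Sequential fallback only; real Happy Eyeballs races the attempts.
int connectAny(const std::string& host, const std::string& port)
{
  for (const Candidate& c : resolveCandidates(host, port)) {
    int fd = socket(c.family, c.socktype, c.protocol);
    if (fd < 0) {
      continue;                     // Family not supported on this host.
    }
    if (connect(fd, reinterpret_cast<const sockaddr*>(&c.addr), c.length) == 0) {
      return fd;
    }
    close(fd);
  }
  return -1;
}

// Open one IPv6 listening socket on "::" that also accepts IPv4 clients
// (as IPv4-mapped addresses) by clearing IPV6_V6ONLY. Returns fd or -1.
int listenDualStack(uint16_t port)
{
  int fd = socket(AF_INET6, SOCK_STREAM, 0);
  if (fd < 0) {
    return -1;
  }

  int off = 0;
  sockaddr_in6 addr{};
  addr.sin6_family = AF_INET6;
  addr.sin6_addr = in6addr_any;
  addr.sin6_port = htons(port);

  if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off)) != 0 ||
      bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) != 0 ||
      listen(fd, SOMAXCONN) != 0) {
    close(fd);
    return -1;
  }
  return fd;
}
```

With something like this, a slave would call connectAny("mesos-master.example.org", "5050") and get whichever address family works first, while the master could get away with a single listening socket from listenDualStack(5050) for option (1).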
So, tl;dr: I believe a lot of useful progress could already be achieved by a relatively small patch series that:

 - Fills in the blanks in stout's net::IP, and gives all functions which take an explicit "family" argument a default value of AF_UNSPEC
 - Updates the IPv4-specific parts in libprocess (in particular, the parsing of IP literals in URL strings and the constructor Socket::create(Kind, Option<int> fd), which should probably be split into e.g. Socket::make(sa_family_t, Kind) and Socket::wrap(int fd, Kind))
 - Changes all call sites to use the new functions

The protobuf IPC format doesn't seem to require any changes, since the only IPv4-dependent field (MasterInfo.ip) was already deprecated in 0.24.0.

After this, the next step would be to look at network isolation and enabling communication in and out of containers.

Thoughts? Comments? Am I missing something?

Best regards,
Benno
