On Wed, Mar 30, 2016 at 6:56 PM, Jeff Schroeder <[email protected]> wrote:
> Given regional bare metal Mesos clusters on multiple continents, are there
> any known issues running some of the agents over the WAN? Is anyone else
> doing it, or is this a terrible idea that I should tell management no on?
>
> A few specifics:
>
> 1. Are there any known limitations or configuration gotchas I might
> encounter?

One thing to keep in mind is that the masters maintain a distributed log
through a consensus protocol, so there needs to be a quorum of masters that
can talk to each other in order to operate. Consensus protocols tend to be
very latency-sensitive, so you probably want to keep the masters near each
other.

Some of our clusters span semi-wide geographical regions (in production, up
to about 5 milliseconds RTT between the master and some slaves). So far we
haven't seen any issues caused by that amount of latency, and I believe we
have clusters in non-production environments with even higher round-trip
times between slaves and masters that work fine. I haven't benchmarked task
launch time or anything like that, so I can't say how much it affects the
speed of operations.

Mesos generally does the right thing around network partitions (changes
won't propagate, but it won't kill your tasks), but if you're running
things in Marathon and using TCP or HTTP health checks, be aware that
Marathon does not rate limit itself on issuing task kills
<https://github.com/mesosphere/marathon/issues/3317> for health check
failures. This means that during a network partition your applications will
be fine, but once the partition heals (or if you're experiencing packet
loss rather than a total failure), Marathon will suddenly kill all of the
tasks on the far side of the partition. A workaround is to use command
health checks, which are run by the mesos slave itself (there's a sketch at
the end of this mail).

> 2. Does setting up ZK observers in each non-primary dc and pointing the
> agents at them exclusively make sense?

My understanding of ZK observers is that they proxy writes to the actual ZK
quorum members, so this would probably be fine. mesos-slave uses ZK to
discover masters, and mesos-master uses ZK to do leader election; only
mesos-master does any writes to ZK. I'm not sure how often mesos-slave
reads from ZK to get the list of masters; I assume it doesn't bother if it
has a live connection to a master. (There's an observer sketch at the end
of this mail as well.)

> 4. Any suggestions on how best to do agent attributes / constraints for
> something like this? I was planning on having the config management add a
> "data_center" agent attribute to match on.

If you're running services on Marathon or similar, I'd definitely recommend
exposing the location of the slaves as an attribute, and having constraints
to keep the different instances of your application spread across the
different locations. The "correct" constraints to apply depend on your
application and its latency / failure sensitivity. (An example of that is
at the end of this mail too.)

Evan

> Thanks!
>
> [1]
> https://github.com/kubernetes/kubernetes/blob/8813c955182e3c9daae68a8257365e02cd871c65/release-0.19.0/docs/proposals/federation.md#kubernetes-cluster-federation
>
> --
> Jeff Schroeder
>
> Don't drink and derive, alcohol and analysis don't mix.
> http://www.digitalprognosis.com
>
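P.S. A few rough, untested sketches to make the above concrete; the app
names, hosts, ports, and endpoints are all made up, so treat these as
outlines rather than drop-in configs.

First, a command health check in a Marathon app definition. The command is
executed on the slave in the task's environment, so it keeps working even
when Marathon can't reach the slave (the "/health" endpoint here is
hypothetical):

    {
      "id": "/my-service",
      "cmd": "./run-my-service --port $PORT0",
      "instances": 6,
      "cpus": 0.5,
      "mem": 512,
      "healthChecks": [
        {
          "protocol": "COMMAND",
          "command": { "value": "curl -f http://$HOST:$PORT0/health" },
          "gracePeriodSeconds": 300,
          "intervalSeconds": 60,
          "maxConsecutiveFailures": 3
        }
      ]
    }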
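For the ZK observers, something along these lines in zoo.cfg (hostnames
made up; the ":observer" suffix goes on the observer's server line in every
node's config, and the observer additionally sets peerType in its own
config):

    # zoo.cfg on every node, voters and observers alike:
    server.1=zk1.dc1.example.com:2888:3888
    server.2=zk2.dc1.example.com:2888:3888
    server.3=zk3.dc1.example.com:2888:3888
    # observer in the remote DC; it forwards writes to the voting ensemble
    server.4=zk-obs1.dc2.example.com:2888:3888:observer

    # only in zk-obs1's own zoo.cfg:
    peerType=observer

    # agents in the remote DC pointed at the local observer:
    mesos-slave --master=zk://zk-obs1.dc2.example.com:2181/mesos ...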
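And for the attributes / constraints, roughly (attribute values made up to
match your "data_center" plan; GROUP_BY and CLUSTER are Marathon constraint
operators):

    # set by config management on each slave, per data center:
    mesos-slave --attributes="data_center:us-east" ...
    mesos-slave --attributes="data_center:eu-west" ...

    # in the Marathon app definition, spread instances across DCs:
    "constraints": [["data_center", "GROUP_BY"]]

    # or pin an app to a single DC:
    "constraints": [["data_center", "CLUSTER", "us-east"]]

One caveat, if I remember correctly: a slave won't reregister after its
attributes change unless you clear out its metadata under the work_dir, so
it's easiest to set the attribute before the slave joins the cluster.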

