This is great info, Evan, especially coming from production experience.
Thanks for sharing it!

On Thu, Mar 31, 2016 at 1:49 PM, Evan Krall <[email protected]> wrote:

> On Wed, Mar 30, 2016 at 6:56 PM, Jeff Schroeder <
> [email protected]> wrote:
>
>> Given regional bare metal Mesos clusters on multiple continents, are
>> there any known issues running some of the agents over the WAN? Is anyone
>> else doing it, or is this a terrible idea that I should push back on with
>> management?
>>
>> A few specifics:
>>
>> 1. Are there any known limitations or configuration gotchas I might
>> encounter?
>>
>
> One thing to keep in mind is that the masters maintain a distributed log
> through a consensus protocol, so there needs to be a quorum of masters that
> can talk to each other in order to operate. Consensus protocols tend to be
> very latency-sensitive, so you probably want to keep masters near each
> other.
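>
> To make the quorum math concrete, here is a trivial Python sketch (for
> illustration only) of how many master failures a given ensemble can
> tolerate. Note that a WAN partition between two sites will leave at most
> one side with a quorum, which is another reason to keep the masters
> together.
>
>     # Illustration only: a quorum is a strict majority of the configured
>     # masters, which is what the --quorum flag is normally set to.
>     def quorum(masters):
>         return masters // 2 + 1
>
>     for n in (3, 5):
>         q = quorum(n)
>         print("%d masters: quorum=%d, tolerates %d failures" % (n, q, n - q))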
>
> Some of our clusters span semi-wide geographical regions (in production,
> up to about 5 milliseconds RTT between the master and some slaves). So far,
> we haven't seen any issues caused by that amount of latency, and I believe
> we have clusters in non-production environments with even higher round-trip
> times between slaves and masters that work fine. I haven't benchmarked task
> launch time or anything like that, so I can't say how much the latency
> affects the speed of operations.
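>
> If you want a rough number for your own links before committing, something
> like this untested sketch gives a crude application-level RTT by timing TCP
> connects ("master.example.com" is a placeholder; 5050 is the default
> mesos-master port):
>
>     import socket, time
>
>     MASTER = ("master.example.com", 5050)  # placeholder master address
>
>     samples = []
>     for _ in range(10):
>         start = time.time()
>         sock = socket.create_connection(MASTER, timeout=5)
>         samples.append((time.time() - start) * 1000.0)
>         sock.close()
>
>     print("TCP connect min/avg/max: %.1f / %.1f / %.1f ms"
>           % (min(samples), sum(samples) / len(samples), max(samples)))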
>
> Mesos generally does the right thing around network partitions (changes
> won't propagate, but it won't kill your tasks), but if you're running
> things in Marathon and using TCP or HTTP health checks, be aware that
> Marathon does not rate-limit itself when issuing task kills
> <https://github.com/mesosphere/marathon/issues/3317> for health check
> failures. This means that during a network partition your applications will
> be fine, but once the partition heals (or if you're experiencing packet
> loss rather than total failure), Marathon will suddenly kill all of the
> tasks on the far side of the partition. A workaround for that is to use
> command health checks, which are run by the Mesos slave.
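>
> For reference, a command health check in a Marathon app definition looks
> roughly like the untested sketch below; the Marathon URL, app id, start
> command, and the health-check command itself are all placeholders to adapt
> to your service.
>
>     import json
>     import requests
>
>     MARATHON = "http://marathon.example.com:8080"  # placeholder URL
>
>     app = {
>         "id": "/example/webapp",        # placeholder app id
>         "cmd": "./run-my-service",      # placeholder start command
>         "cpus": 0.5,
>         "mem": 256,
>         "instances": 4,
>         "healthChecks": [
>             {
>                 # COMMAND health checks run on the slave itself, so they
>                 # keep working even when Marathon can't reach the slave
>                 # over the WAN.
>                 "protocol": "COMMAND",
>                 "command": {"value": "curl -f http://localhost:8080/health"},
>                 "gracePeriodSeconds": 30,
>                 "intervalSeconds": 10,
>                 "maxConsecutiveFailures": 3,
>             }
>         ],
>     }
>
>     resp = requests.post(MARATHON + "/v2/apps",
>                          headers={"Content-Type": "application/json"},
>                          data=json.dumps(app))
>     resp.raise_for_status()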
>
>
>> 2. Does setting up ZK observers in each non-primary dc and pointing the
>> agents at them exclusively make sense?
>>
>
> My understanding of ZK observers is that they proxy writes to the actual
> ZK quorum members, so this would probably be fine. mesos-slave uses ZK to
> discover masters, and mesos-master uses ZK to do leader election; only
> mesos-master is doing any writes to ZK.
>
> I'm not sure how often mesos-slave reads from ZK to get the list of
> masters; I assume it doesn't bother if it has a live connection to a master.
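>
> If you want to check what the slaves in a remote DC would see, an untested
> sketch along these lines (using the kazoo library, and assuming the masters
> register under the default /mesos path with the json.info_* znodes recent
> Mesos versions use) lists the masters and the current leader when pointed
> at a local observer:
>
>     import json
>     from kazoo.client import KazooClient
>
>     # Placeholder: a ZK observer local to the remote data center.
>     zk = KazooClient(hosts="zk-observer.dc2.example.com:2181")
>     zk.start()
>
>     # Masters register ephemeral sequential znodes named json.info_<seq>;
>     # the lowest sequence number is the current leader.
>     children = sorted(c for c in zk.get_children("/mesos")
>                       if c.startswith("json.info_"))
>     for i, child in enumerate(children):
>         data, _stat = zk.get("/mesos/" + child)
>         info = json.loads(data.decode("utf-8"))
>         role = "leader" if i == 0 else "standby"
>         print("%s: %s:%s" % (role, info["address"]["ip"],
>                              info["address"]["port"]))
>
>     zk.stop()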
>
>
>> 4. Any suggestions on how best to do agent attributes / constraints for
>> something like this? I was planning on having the config management add a
>> "data_center" agent attribute to match on.
>>
>
> If you're running services on Marathon or similar, I'd definitely
> recommend exposing the location of the slaves as an attribute and using
> constraints to keep the instances of your application spread across the
> different locations. The "correct" constraints to apply depend on your
> application and its sensitivity to latency and failures.
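>
> As a concrete (again untested) example, with slaves started with something
> like --attributes=data_center:dc1, a Marathon app definition can spread
> instances across data centers with GROUP_BY, or pin them to one DC with
> CLUSTER, depending on what the service can tolerate:
>
>     # Fragment of a Marathon app definition (same placeholders as above);
>     # "data_center" is the agent attribute set by config management.
>     app = {
>         "id": "/example/webapp",
>         "instances": 6,
>         "constraints": [
>             # Spread instances evenly across the distinct data_center values.
>             ["data_center", "GROUP_BY"],
>             # Or pin a latency-sensitive app to a single DC instead:
>             # ["data_center", "CLUSTER", "dc1"],
>         ],
>     }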
>
> Evan
>
>
>> Thanks!
>>
>> [1]
>> https://github.com/kubernetes/kubernetes/blob/8813c955182e3c9daae68a8257365e02cd871c65/release-0.19.0/docs/proposals/federation.md#kubernetes-cluster-federation
>>
>> --
>> Jeff Schroeder
>>
>> Don't drink and derive, alcohol and analysis don't mix.
>> http://www.digitalprognosis.com
>>
>
>
