Hey,

We've been running Mesos slaves across sites, most in an off-site
private cloud, with AWS EC2 for extra burst capacity when required,
over a Direct Connect link. We've found this model to work well on the
Mesos side, though it's key to understand the interaction between tasks
running across multiple sites at the same time, so you're aware of the
effects of latency and throughput (e.g. data transfer if you're using
Hadoop).

All of our master nodes (and the ZooKeeper quorum) are in the same site
(though not the same physical location), but this isn't an issue for us
since we're not using AWS as a mechanism for redundancy.
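
(For master HA itself, the usual approach is to run three or five
masters against a ZooKeeper quorum, starting each with something like
mesos-master --zk=zk://zk1:2181,zk2:2181,zk3:2181/mesos --quorum=2;
the hostnames here are placeholders. ZooKeeper handles leader election,
which is why the masters want low-latency links to the quorum.)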

I'm not aware of any built-in resource unit for network latency or
throughput, but I don't see any reason you couldn't specify your own on
each slave and configure frameworks to take that into account when making
scheduling decisions. The recent addition of the network isolator (
http://mesos.apache.org/documentation/latest/network-monitoring/) might
also be of use to you here.
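
As a rough sketch of the attribute approach (the names "zone" and
"latency_ms" and the threshold below are made up, and this assumes the
0.20-era Python bindings): start each slave with something like
--attributes='zone:aws;latency_ms:25', then have the framework's
scheduler filter offers on those attributes before launching tasks:

    from mesos.interface import Scheduler, mesos_pb2

    MAX_LATENCY_MS = 50  # illustrative threshold; not a unit Mesos knows about

    def offer_attributes(offer):
        """Flatten an offer's attributes into a name -> value dict."""
        attrs = {}
        for attr in offer.attributes:
            if attr.type == mesos_pb2.Value.SCALAR:
                attrs[attr.name] = attr.scalar.value
            elif attr.type == mesos_pb2.Value.TEXT:
                attrs[attr.name] = attr.text.value
        return attrs

    class LatencyAwareScheduler(Scheduler):
        def resourceOffers(self, driver, offers):
            for offer in offers:
                attrs = offer_attributes(offer)
                # Decline offers from slaves advertising more latency
                # than this framework can tolerate.
                if attrs.get('latency_ms', float('inf')) > MAX_LATENCY_MS:
                    driver.declineOffer(offer.id)
                    continue
                # ... otherwise build TaskInfos and call
                # driver.launchTasks(offer.id, tasks) as usual.

The same trick works for throughput or a site tag; the catch is that
you have to measure and publish the numbers yourself when each slave
starts.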

Very interested in what others are doing in this space.

Tom.


On 26 August 2014 09:19, Yaron Rosenbaum <[email protected]> wrote:

> Hi
>
> Here's a crazy idea:
> Is it possible / has anyone tried to run Mesos where the slaves are in
> radically different network zones? For example: A few slaves on Azure, a
> few slaves on AWS, and a bunch of other slaves on-premises, etc.
>
>    - Assuming it's possible, is it possible to define resource
>    requirements for tasks, in terms of 'access to network resource A with less
>    than X latency and throughput between i and m' for example?
>    - Masters would probably have to be 'close' to each other, to prevent
>    'split-brain' scenarios, true or not?
>       - If so, then how does one assure Master HA?
>
>
> I've been thinking about this for a while, and can't find a reason 'why
> not'.
>
> Please share your thoughts on the subject.
>
> (Y)
>
>
