Hey everyone,

First off, this is a long email, so brace yourself. I appreciate your
time and patience.

I wanted to open up discussion around this, since I've spoken to several
people in isolation and feel it would be great to come to some kind of
resolution. My aim here is to try and improve the new Docker Containerizer
and help to support new use cases that are currently not catered for.

I get the impression Docker + Mesos use cases fall into two groups. There's
the group of users that wish to say "I want to run a docker container on S
slave with R resources", which in almost every case I have come across is
for long running services. The other group is more along the lines of, "I
want to use docker instead of cgroup isolation directly".

There are currently two mechanisms to achieve these things: the External
Containerizer (0.19.0) and the Docker Containerizer (0.20.0), both of which
allow you to run docker containers in some way or another.

- The external containerizer implementation inside mesos knows nothing of
Docker, and any Docker specific responsibilities are handed off to another
subprocess (Deimos or similar).
- The docker containerizer is a native Docker integration built into
mesos, and requires zero additional dependencies or configuration (except
for Docker itself).

Each containerizer allows a framework to send additional details with a
`TaskInfo` protobuf to configure the underlying Docker container that's
going to be launched.

- The external containerizer allows the user to specify an image to use
(`ContainerInfo.image`) if they choose
- The docker containerizer requires the user to also specify an image to
use (`ContainerInfo.DockerInfo.image`)

Now here's the difference...

- The external containerizer allows the user to specify a list of "args"
which currently equate to CLI arguments to the `docker run` invocation.
- The external containerizer will be responsible for managing isolation for
all executors.
- The external containerizer is still given the opportunity to act even if
no image or arguments for the container are provided.

- The docker containerizer has a set of explicit, structured options that
define the container, e.g. `ContainerInfo.volumes` and `Volume` protobufs
(though there is a limited set of supported options, these are growing over
time)
- The docker containerizer will *only* isolate tasks whose
`ContainerInfo.type` is `ContainerInfo.Type.DOCKER`, and passes on anything
else
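
To make that difference concrete, here is a rough sketch in Python of how I
understand the two behaviours. The classes below are simplified stand-ins
for the protobufs named above and are not the real Mesos definitions. The
two `*_handles` functions and the image name are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ContainerType(Enum):
    DOCKER = "DOCKER"


@dataclass
class DockerInfo:
    # Stand-in for ContainerInfo.DockerInfo; only `image` is modelled.
    image: str


@dataclass
class ContainerInfo:
    # Simplified stand-in for the ContainerInfo protobuf (an assumption,
    # not the real schema).
    type: Optional[ContainerType] = None
    image: Optional[str] = None          # external containerizer style
    docker: Optional[DockerInfo] = None  # docker containerizer style


@dataclass
class TaskInfo:
    name: str
    container: Optional[ContainerInfo] = None


def docker_containerizer_handles(task: TaskInfo) -> bool:
    """Hypothetical check: only tasks explicitly typed DOCKER are isolated."""
    container = task.container
    return container is not None and container.type is ContainerType.DOCKER


def external_containerizer_handles(task: TaskInfo) -> bool:
    """Hypothetical check: the external containerizer acts on every task,
    even when no image or arguments are provided."""
    return True


# A task with no container details, and one that asks for Docker explicitly.
plain = TaskInfo(name="plain-command")
dockerized = TaskInfo(
    name="web",
    container=ContainerInfo(
        type=ContainerType.DOCKER,
        docker=DockerInfo(image="example/web"),  # made-up image name
    ),
)
```

With this model, `plain` is picked up by the external containerizer but
passed on by the docker containerizer, which is the gap I get to further
down.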

@tnachen/@benh (and others involved) have taken the approach of abstracting
the container options into a subset of structured values, to separate
frameworks from the docker command line tool and the arguments it accepts.
In the future Mesos might use the Docker Remote API instead of a
subprocess to `docker run`, and this allows such a change to be
transparent. This is really great imo.

Back to the use cases...

> "I want to run a docker container on S slave with R resources"

This is covered really well with the docker containerizer. A framework can
choose which docker image is used, specify options in an abstract manner,
and Mesos is going to handle the rest. The user must specify an image to
run, but that's OK because the user knows they want to use a specific
image, and that they want to use Docker.

> "I want to use docker instead of cgroup isolation directly"

This use case isn't supported at all by the docker containerizer. From a
system administrator's perspective, Docker can be a great tool to completely
isolate processes from one another and more importantly from the host. It
allows different dependencies to be installed and used for different
processes, and means host machines (mesos slaves) can be very pure. If the
person deploying Mesos is separated enough from those using it, the
administrator might not want to enforce that every user using any framework
*must* supply `ContainerInfo` details, especially if they're launching
tasks that require very common things (e.g. python27). These common tools
may or may not be installed on the host.

My motivation here is to reduce how much users or framework developers
need to understand about how the Mesos cluster is put together (and the
specifics of what is going on behind the scenes) and let them focus on
trying out frameworks or building their own.

Essentially, this means being able to launch Mesos tasks and executors
inside docker containers transparently to the user. We've taken this
approach with the External Containerizer and our docker containerizer
implementation, and it has proved to work very well for just "getting
started" on our cluster. The system administrators can sleep peacefully
knowing that although anyone can run pretty much what they like on the
cluster, slaves aren't going to snowflake or become damaged. The
`--default_container_image` command line option supported by the
`mesos-slave` process was very useful here, but I can see how that explicit
approach can cause problems once you introduce the concept of multiple
containerizers.

-----------------------------------------------------------

Given this background, I have a few questions.

- Can the docker containerizer support more friendly defaults? For example,
I may want my mesos cluster to containerize everything with Docker, without
requiring every user to specify an image for their tasks.

- Since all of this Docker work has made its way into the mesos core, a
lot of implicit decisions have been made about what options to support, how
to expose them to users, and how the workflow looks for frameworks and
users. I think this is pretty limiting, and given Mesos is designed to
provide the fundamental building blocks for your datacenter, building for specific
workflows concerns me. Is finding solutions to all of these workflows
really something the mesos-core team should focus on?

- I think the external containerizer is a valuable asset to Mesos even without
Docker. There have been a few questions on the mailing list about using
other types of isolation (on Windows for example). Could more work be done
to unify configuration for each built-in containerizer? I feel like lots of
different command line options might be a little confusing over time.
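
For the first question above, the behaviour I'm picturing is roughly the
following. This is only a sketch of the idea, not a proposal for how it
should be implemented inside Mesos. `resolve_image` is a hypothetical
helper, the image names are made up, and the "default" stands for something
like the value an administrator passes via `--default_container_image`.

```python
from typing import Optional


def resolve_image(task_image: Optional[str],
                  default_image: Optional[str]) -> Optional[str]:
    """Hypothetical helper: prefer the image the task asked for, otherwise
    fall back to the cluster-wide default chosen by the administrator.

    Returns None when neither is set, meaning there is nothing to run the
    task in and some other containerizer would have to take over.
    """
    if task_image:
        return task_image
    return default_image


# A user who knows which image they want keeps full control...
explicit = resolve_image("example/web", "example/base")

# ...while a user who supplies no container details still ends up isolated.
fallback = resolve_image(None, "example/base")
```

The point is that the framework author never has to know the default
exists. Only the person running the cluster does.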

I'm very interested to hear what others think, and please do correct me if
I'm wrong in anything I've said. Looking forward to the discussion.

Tom.
