Hey everyone,

First off, this is a long email, so brace yourself. I appreciate your time and patience.

I wanted to open up discussion around this, since I've spoken to several people in isolation and feel it would be great to come to some kind of resolution. My aim here is to try and improve the new Docker Containerizer and help support new use cases that are currently not catered for.

I get the impression Docker + Mesos use cases fall into two groups. There's the group of users that wish to say "I want to run a docker container on S slave with R resources", which in almost every case I have come across is for long running services. The other group is more along the lines of "I want to use docker instead of cgroup isolation directly".

There are currently two mechanisms to achieve these things, the External Containerizer (0.19.0) and the Docker Containerizer (0.20.0), both of which allow you to run docker containers in some way or another.

- The external containerizer implementation inside mesos knows nothing of Docker, and any Docker-specific responsibilities are handed off to another subprocess (Deimos or the like).
- The docker containerizer is a native implementation of Docker support built in to mesos, and requires zero additional dependencies or configuration (except for Docker itself).

Each containerizer allows a framework to send additional details with a `TaskInfo` protobuf to configure the underlying Docker container that's going to be launched.

- The external containerizer allows the user to specify an image to use (`ContainerInfo.image`) if they choose.
- The docker containerizer requires the user to also specify an image to use (`ContainerInfo.DockerInfo.image`).

Now here's the difference...

- The external containerizer allows the user to specify a list of "args" which currently equate to CLI arguments to the `docker run` invocation.
- The external containerizer will be responsible for managing isolation for all executors.
- The external containerizer is still given the opportunity to act even if no image or arguments for the container are provided.
- The docker containerizer has a set of explicit, structured options that define the container, e.g. `ContainerInfo.volumes` and `Volume` protobufs (though there is a limited set of supported options, these are growing over time).
- The docker containerizer will *only* isolate tasks that include a `ContainerInfo.type == ContainerInfo.Type.DOCKER` and pass on anything else.

@tnachen/@benh (and others involved) have taken the approach of abstracting the container options into a subset of structured values, to separate frameworks from the docker command line tool and the arguments it accepts. Maybe in the future Mesos will use the Docker Remote API instead of a subprocess to `docker run`, and this allows such a change to be transparent. This is really great imo.

Back to the use cases...

> "I want to run a docker container on S slave with R resources"

This is covered really well with the docker containerizer. A framework can choose which docker image is used, specify options in an abstract manner, and Mesos is going to handle the rest. The user must specify an image to run, but that's OK because the user knows they want to use a specific image, and that they want to use Docker.

> "I want to use docker instead of cgroup isolation directly"

This use case isn't supported at all by the docker containerizer. From a system administrator's perspective, Docker can be a great tool to completely isolate processes from one another and, more importantly, from the host. It allows different dependencies to be installed and used for different processes, and means host machines (mesos slaves) can be very pure. If the person deploying Mesos is separated enough from those using it, the administrator might not want to enforce that every user of any framework *must* supply `ContainerInfo` details, especially if they're launching tasks that require very common things (e.g. python27). These common tools may or may not be installed on the host.

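The difference in which tasks each containerizer picks up, and the kind of fallback the second use case would need, can be sketched roughly in Python. This is a toy model and not Mesos code: `TaskInfo` and `ContainerInfo` here are simplified stand-ins for the protobufs (the image is flattened onto `ContainerInfo`), and the function names, including the `image_for` fallback, are made up purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ContainerType(Enum):
    DOCKER = "DOCKER"


@dataclass
class ContainerInfo:
    # Simplified stand-in for the ContainerInfo protobuf (assumption).
    type: Optional[ContainerType] = None
    image: Optional[str] = None


@dataclass
class TaskInfo:
    # Simplified stand-in for the TaskInfo protobuf (assumption).
    name: str
    container: Optional[ContainerInfo] = None


def docker_containerizer_handles(task: TaskInfo) -> bool:
    """Docker containerizer: only tasks explicitly typed DOCKER."""
    return (
        task.container is not None
        and task.container.type is ContainerType.DOCKER
    )


def external_containerizer_handles(task: TaskInfo) -> bool:
    """External containerizer: gets the chance to act on every task."""
    return True


def image_for(task: TaskInfo, default_image: Optional[str]) -> Optional[str]:
    """Hypothetical fallback in the spirit of --default_container_image."""
    if task.container is not None and task.container.image:
        return task.container.image
    return default_image


if __name__ == "__main__":
    plain = TaskInfo(name="plain-task")
    docker = TaskInfo(
        name="web",
        container=ContainerInfo(type=ContainerType.DOCKER, image="example/web"),
    )
    for task in (plain, docker):
        print(
            task.name,
            docker_containerizer_handles(task),
            external_containerizer_handles(task),
            image_for(task, "example/default"),
        )
```

The point is the `plain-task` case: the docker containerizer passes on it, while something like the external containerizer with a default image can still put it in a container without the user knowing.
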
My motivation here is to reduce the amount users or framework developers need to understand about how the Mesos cluster is put together (and the specifics of what is going on behind the scenes) and let them focus on trying out frameworks or building their own. Essentially what this means is the ability to launch Mesos tasks and executors inside docker containers transparently to the user. We've taken this approach with the External Containerizer and our docker containerizer implementation, and it has proved to work very well for just "getting started" on our cluster. The system administrators can sleep peacefully knowing that although anyone can run pretty much what they like on the cluster, slaves aren't going to snowflake or become damaged.

The `--default_container_image` command line option supported by the `mesos-slave` process was very useful here, but I can see how that explicit approach can cause problems once you introduce the concept of multiple containerizers.

-----------------------------------------------------------

Given this background, I have a few questions.

- Can the docker containerizer support more friendly defaults? I may only want my mesos cluster to containerize things with Docker, without requiring every user to specify an image for their tasks.
- Since all of this Docker work has made its way into the mesos core, a lot of implicit decisions have been made about what options to support, how to expose them to users, and how the workflow looks for frameworks and users. I think this is pretty limiting, and given Mesos is designed to be the fundamental building blocks for your datacenter, building for specific workflows concerns me. Is finding solutions to all of these workflows really something the mesos-core team should focus on?
- I think the external containerizer is a valuable asset to Mesos even without Docker. There have been a few questions on the mailing list about using other types of isolation (on Windows, for example).
Could more work be done to unify configuration across the built-in containerizers? I feel like lots of different command line options might get a little confusing over time.

I'm very interested to hear what others think, and please do correct me if I'm wrong in anything I've said. Looking forward to the discussion.

Tom.
