Since Mesosphere distributes Mesos as a container image (
https://hub.docker.com/r/mesosphere/mesos-slave/), I decided to try this
option. After experimenting with various settings I settled on a
configuration that basically works. But I do see one problem, and that is
what this message is about.
To start off, I find it strange that the image does not contain the docker
binary itself. After all, in order to use --containerizers=docker one
needs to point the mesos slave at a docker binary. If I bind-mount the host's
docker binary to the container's /usr/local/bin/docker and use the option
--docker=/usr/local/bin/docker, I run into the problem of dynamic library
dependencies: docker depends on a bunch of dynamic libraries:
======================
ldd /usr/bin/docker
linux-vdso.so.1 => (0x00007fffaebfe000)
libsystemd-journal.so.0 => /lib/x86_64-linux-gnu/libsystemd-journal.so.0 (0x00007f0a1458b000)
libapparmor.so.1 => /usr/lib/x86_64-linux-gnu/libapparmor.so.1 (0x00007f0a1437f000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f0a14160000)
libdevmapper.so.1.02.1 => /lib/x86_64-linux-gnu/libdevmapper.so.1.02.1 (0x00007f0a13f27000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0a13b62000)
... and many more
===========================
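To make the dependency problem concrete, here is a small sketch that summarizes `ldd` output. The function names `count_shared_deps` and `is_static` are my own, not part of any tool, and the match strings assume the messages printed by glibc's `ldd`.

```shell
#!/bin/sh
# Count the shared-library dependencies in `ldd` output read from stdin.
# Lines containing "=>" are counted; the linux-vdso entry is skipped
# because the kernel provides it and no file has to be mounted for it.
count_shared_deps() {
    grep '=>' | grep -vc 'linux-vdso'
}

# Succeed if the `ldd` output on stdin describes a statically linked
# binary. Assumption: ldd reports this as "not a dynamic executable"
# or "statically linked", depending on the glibc version.
is_static() {
    grep -Eq 'not a dynamic executable|statically linked'
}
```

For example, `ldd /usr/bin/docker | count_shared_deps` prints how many libraries would have to be visible inside the container, and `ldd /usr/local/bin/docker 2>&1 | is_static && echo static` checks whether a given binary avoids the problem altogether.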
Mounting */lib/x86_64-linux-gnu/* into the container is a horrible idea that
is not worth discussing. So I wonder what the rationale is behind the decision
not to include the docker binary in the mesosphere image, and how other people
solve this problem.
Here is one solution that I found. I use *docker:dind*, not as a running
container but rather as a volume:
==============================
docker create --name "docker-proxy" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /usr/local/bin docker:dind
===============================
This container has a fully functional docker binary in its
/usr/local/bin, and this is all I need it for. To make the mesos-slave
container see this binary I simply use the *--volumes-from* option:
==========
docker run -d --restart=unless-stopped --volumes-from "docker-proxy" \
  --name $MESOS_SLAVE $MESOS_SLAVE_IMAGE \
  --docker=/usr/local/bin/docker --containerizers="docker,mesos" ...
==========
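A note on argument order, since it is easy to get wrong: docker's own options must come before the image name, and everything after the image name is passed to the image's entrypoint (mesos-slave here). The sketch below only prints a command line and runs nothing; `print_slave_cmd` is a hypothetical helper of mine.

```shell
#!/bin/sh
# Print (do not run) a `docker run` command line for the slave container.
# Arguments: container name, image name, then any mesos-slave flags.
# Docker options are emitted before the image, slave flags after it.
print_slave_cmd() {
    name=$1
    image=$2
    shift 2
    printf 'docker run -d --restart=unless-stopped --volumes-from docker-proxy --name %s %s' \
        "$name" "$image"
    for flag in "$@"; do
        printf ' %s' "$flag"
    done
    printf '\n'
}
```

Calling `print_slave_cmd "$MESOS_SLAVE" "$MESOS_SLAVE_IMAGE" --docker=/usr/local/bin/docker --containerizers=docker,mesos` shows the command with the flags in the order docker expects.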
This works like a charm. But there is the following problem.
In order for mesos-slave to function in this mode, it needs to spawn
executors in docker containers as well. For that purpose the mesos slave has
the option *--docker_mesos_image=*, which should be set to the same image
name that is used to launch the mesos slave. If I do this,
--docker_mesos_image="$MESOS_SLAVE_IMAGE"
I see that every attempt to spawn a task fails, because the option
*--docker=/usr/local/bin/docker* is apparently injected into the executor
container but the *--volumes-from="docker-proxy"* option is NOT! So the
executor is dysfunctional without that docker binary.
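One way to confirm this diagnosis is to look at the configuration of a container with `docker inspect`, which reports the volumes-from setting under `.HostConfig.VolumesFrom`. The sketch below is an assumption about how one might check it; `inherits_volumes_from` is a hypothetical helper, and the executor container name has to be looked up by hand (for example with `docker ps -a`).

```shell
#!/bin/sh
# Succeed if the JSON list on stdin mentions the given container name.
# Intended input is the output of:
#   docker inspect --format '{{json .HostConfig.VolumesFrom}}' <container>
# which is a JSON list such as ["docker-proxy"], or null when the
# container was started without --volumes-from.
inherits_volumes_from() {
    grep -q "\"$1\""
}
```

Running `docker inspect --format '{{json .HostConfig.VolumesFrom}}' "$MESOS_SLAVE" | inherits_volumes_from docker-proxy` should succeed for the slave container, while the same check against an executor container would be expected to fail if the option is indeed not propagated.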
So, to summarize, I'm raising two questions:
1. What is the best method to point a mesos-slave running in a container to a
working copy of the docker binary, such that executor containers also
inherit visibility of this binary?
2. If my proposed method based on docker:dind is deemed reasonable in
general, should I file a Jira ticket to request that, in addition to
*--docker_mesos_image*, one gets the ability to pass additional settings
to the executor container, such as *--volumes-from*? This is not easy to
formulate, as other similar options may potentially need to be configured
as well.
P.S.
The full script showing how I launch the mesos slave is below:
for i in "${MESOS_SLAVE_NODES[@]}"; do
  eval "$(docker-machine env "$i")"
  NODE_IP=$(docker-machine ip "$i")
  # mesos-slave requires access to the docker binary, but the container image
  # does not contain it. For that reason I create (but do not run!) a
  # docker-in-docker container, which contains a statically linked version of
  # the docker binary in /usr/local/bin. Then, using the '--volumes-from'
  # option on the mesos container, I make this binary visible.
  remove_container "docker-proxy"
  docker create --name "docker-proxy" \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /usr/local/bin docker:dind
  remove_container "$MESOS_SLAVE"
  log "Starting mesos slave on $i"
  docker run -d --restart=unless-stopped --volumes-from "docker-proxy" \
    --name "$MESOS_SLAVE" \
    --net='host' \
    --pid='host' \
    -e "TZ=$TIMEZONE" \
    --privileged \
    -v /sys/fs/cgroup:/host/sys/fs/cgroup \
    "$MESOS_SLAVE_IMAGE" \
    --master="zk://$zk/mesos" \
    --advertise_ip="$NODE_IP" \
    --ip="$NODE_IP" \
    --resources="ports:[8000-9000, 3000-3200]" \
    --cgroups_hierarchy=/host/sys/fs/cgroup \
    --docker=/usr/local/bin/docker \
    --containerizers="docker,mesos" \
    --log_dir=/var/log/mesos \
    --logging_level=INFO \
    --docker_remove_delay=1hrs \
    --gc_delay=2hrs \
    --executor_registration_timeout=5mins
  # --docker_mesos_image="$MESOS_SLAVE_IMAGE" \
done
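The script calls two helpers, `remove_container` and `log`, whose definitions are not included above. A minimal sketch of what they might look like follows; the bodies are assumptions for illustration, not the actual definitions.

```shell
#!/bin/sh
# Hypothetical helper: print a timestamped message to stderr.
log() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >&2
}

# Hypothetical helper: force-remove a container if it exists.
# `docker inspect` fails for unknown names, so nothing is removed
# (and no error is raised) when the container is absent.
remove_container() {
    if docker inspect "$1" >/dev/null 2>&1; then
        log "Removing container $1"
        docker rm -f "$1" >/dev/null
    fi
}
```

With these in place, `remove_container "docker-proxy"` is safe to call on a fresh node where the container has never been created.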