Hi there - I'm working on setting up a Mesos environment with the
Docker containerizer and can't seem to get the recovery feature
working. I'm running CoreOS, so the slave processes themselves are
containerized. I have no issues running jobs without the recovery
features enabled, but all jobs fail to boot when I add the following
flags:

    MESOS_DOCKER_KILL_ORPHANS=false
    MESOS_DOCKER_MESOS_IMAGE=myrepo/my-slave-container

Inspecting the Docker containers and their log output reveals that the
container invocation appears to be flawed - see this gist:

https://gist.github.com/banjiewen/a2dc1784a82ed87edd6b

The containerizer is attempting to invoke an unquoted command via
`/bin/sh -c`. The shell then treats only the first word as the command
string and the remaining words as positional parameters, so the
complete command is never run. This results in the error message shown
in the second file in the linked gist.

This is reproducible manually; quoting the arguments to `/bin/sh -c`
results in success (at least, it correctly receives the supplied
arguments).
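
To illustrate the quoting behavior outside of Mesos, here is a minimal
Python sketch. It is not the containerizer's actual invocation: the
`echo hello world` command and the `run_sh` helper are stand-ins made up
for the example, and it assumes a POSIX `/bin/sh` is present.

```python
import subprocess


def run_sh(args_after_dash_c):
    """Run `/bin/sh -c <args...>` and return whatever it wrote to stdout.

    `run_sh` is a hypothetical helper for this example only.
    """
    result = subprocess.run(
        ["/bin/sh", "-c", *args_after_dash_c],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Unquoted form: each word is a separate argv entry. sh takes only "echo"
# as the command string; "hello" becomes $0 and "world" becomes $1, so the
# echo runs with no arguments and prints an empty line.
unquoted = run_sh(["echo", "hello", "world"])

# Quoted form: the whole command is a single argv entry, so sh runs it
# as written.
quoted = run_sh(["echo hello world"])

print(repr(unquoted))
print(repr(quoted))
```

The first call loses the arguments and the second keeps them, which
matches what I see when reproducing by hand.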

I gather that this is related to MESOS-2115, and it's clear that this
patch[1] changed that behavior significantly, but if it introduced a
bug I can't see it. It's possible that my instance is configured
incorrectly as well; the documentation here is a bit vague and there
aren't many examples on the web.

Thanks in advance,
--
b

[1]: https://github.com/apache/mesos/commit/3baa60965407bf0c3eb9c3da1b2ba7c0a4fee968
