Re: Troubles with slave recovery via Docker containerizer on 0.23.0

2015-08-06 Thread Benjamin Anderson

Hi Tim,

That's the output from `docker inspect`. I've gisted the full contents
of the container's log file (in all of its JSON-encoded glory) here:

https://gist.githubusercontent.com/banjiewen/6450a06f958a2e7630bf/raw/12183fe891c1ddaf7019b478278c47c479d77c01/gistfile1.txt
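
For readers following along: `docker inspect` emits a JSON array with one object per container, and the `Path` and `Args` fields of each object hold the process argv as Docker recorded it. A minimal Python sketch of pulling that argv out is below. The sample JSON is a hypothetical, heavily trimmed stand-in and is not the content of the gist above; `recorded_argv` is a name made up for this illustration.

```python
import json
import shlex

# Hypothetical, trimmed stand-in for `docker inspect <container>` output.
# Real output has many more fields; only Path and Args are used here.
SAMPLE = """
[
  {
    "Path": "/bin/sh",
    "Args": ["-c", "echo", "hello", "world"]
  }
]
"""


def recorded_argv(inspect_output):
    """Return the argv Docker recorded for the first container listed."""
    container = json.loads(inspect_output)[0]
    return [container["Path"], *container["Args"]]


argv = recorded_argv(SAMPLE)

# shlex.join quotes each element, so argument boundaries stay visible.
print(shlex.join(argv))
```

Because the boundaries between arguments are preserved in the JSON, this view shows whether the command string after `-c` arrived as one argument or as several.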

The slave itself isn't logging much of interest, just various
"Executor has terminated with unknown status" messages, etc.

For context, my container is running 0.23.0 installed from packages on
Ubuntu 14.04. Docker is at 1.6.2.

--
b

Re: Troubles with slave recovery via Docker containerizer on 0.23.0

2015-08-06 Thread Tim Chen

Got it, this shouldn't happen. Can you open a JIRA ticket? I'll try to
repro today.

Tim

Re: Troubles with slave recovery via Docker containerizer on 0.23.0

2015-08-06 Thread Benjamin Anderson

Awesome, thanks Tim.

https://issues.apache.org/jira/browse/MESOS-3219

--
b

Troubles with slave recovery via Docker containerizer on 0.23.0

2015-08-05 Thread Benjamin Anderson

Hi there - I'm working on setting up a Mesos environment with the
Docker containerizer and can't seem to get the recovery feature
working. I'm running CoreOS, so the slave processes themselves are
containerized. I have no issues running jobs without the recovery
features enabled, but all jobs fail to boot when I add the following
flags:

MESOS_DOCKER_KILL_ORPHANS=false
MESOS_DOCKER_MESOS_IMAGE=myrepo/my-slave-container

Inspecting the Docker images and their log output reveals that the
container invocation appears to be flawed - see this gist:

https://gist.github.com/banjiewen/a2dc1784a82ed87edd6b

The containerizer is attempting to invoke an unquoted command via
`/bin/sh -c`, which, predictably, fails to pass the complete command.
This results in the error message shown in the second file in the
linked gist.

This is reproducible manually; quoting the arguments to `/bin/sh -c`
results in success (at least, it correctly receives the supplied
arguments).
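
The behavior described above can be reproduced outside Mesos. The following is a minimal Python sketch, not the containerizer's actual code: `echo hello world` is a stand-in for the real command line in the gist, and `run_sh` is a helper name invented for this illustration. It relies on the POSIX rule that `sh -c` takes only the first operand as the command string and assigns the remaining operands to `$0`, `$1`, and so on.

```python
import subprocess


def run_sh(*args):
    """Run /bin/sh -c with the given operands and return its stdout."""
    result = subprocess.run(
        ["/bin/sh", "-c", *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Unquoted: only "echo" is the command string. "hello" and "world"
# become $0 and $1 of the shell, so echo runs with no arguments.
unquoted = run_sh("echo", "hello", "world")

# Quoted: the whole command reaches the shell as a single string.
quoted = run_sh("echo hello world")

print(repr(unquoted))  # '\n'
print(repr(quoted))    # 'hello world\n'
```

The unquoted form does not raise an error here because `echo` is valid on its own; with a command that requires its arguments, the same truncation shows up as a failure.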

I gather that this is related to MESOS-2115, and it's clear that this
patch[1] changed that behavior significantly, but if it introduced a
bug I can't see it. It's possible that my instance is configured
incorrectly as well; the documentation here is a bit vague and there
aren't many examples on the web.

Thanks in advance,
--
b

[1]: https://github.com/apache/mesos/commit/3baa60965407bf0c3eb9c3da1b2ba7c0a4fee968

Re: Troubles with slave recovery via Docker containerizer on 0.23.0

2015-08-05 Thread Tim Chen

Hi Ben,

Did you get the command from `docker inspect` or from the slave log?

If it's from the slave log, note that we don't actually print the exact way
we exec the command; we just join the exec arguments with a space in
between.
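
As an illustration of why a space-joined log line cannot show how a command was really executed, here is a small Python sketch. The argv is hypothetical and is not taken from an actual slave log, and this is not how Mesos formats its log lines internally; it only contrasts a plain join with a quoting-aware one from the standard `shlex` module.

```python
import shlex

# Hypothetical argv, for illustration only.
argv = ["/bin/sh", "-c", "echo hello world"]

# Joining with spaces hides where one argument ends and the next begins.
logged = " ".join(argv)

# shlex.join quotes each argument, so the boundaries survive.
faithful = shlex.join(argv)

print(logged)    # /bin/sh -c echo hello world
print(faithful)  # /bin/sh -c 'echo hello world'

# Only the quoted form can be split back into the original argv.
round_trips = shlex.split(faithful) == argv
lossy = shlex.split(logged) != argv
```

In other words, a correctly quoted exec and a broken unquoted one can look identical in a space-joined log line, which is why the source of the command output matters.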

What's the exact error in the slave/sandbox stderr log?

Tim


On Wed, Aug 5, 2015 at 4:18 PM, Benjamin Anderson benja...@ivysoftworks.com
 wrote:

 Hi there - I'm working on setting up a Mesos environment with the
 Docker containerizer and can't seem to get the recovery feature
 working. I'm running CoreOS, so the slave processes themselves are
 containerized. I have no issues running jobs without the recovery
 features enabled, but all jobs fail to boot when I add the following
 flags:

 MESOS_DOCKER_KILL_ORPHANS=false
 MESOS_DOCKER_MESOS_IMAGE=myrepo/my-slave-container

 Inspecting the Docker images and their log output reveals that the
 container invocation appears to be flawed - see this gist:

 https://gist.github.com/banjiewen/a2dc1784a82ed87edd6b

 The containerizer is attempting to invoke an unquoted command via
 `/bin/sh -c`, which, predictably, fails to pass the complete command.
 This results in the error message shown in the second file in the
 linked gist.

 This is reproducible manually; quoting the arguments to `/bin/sh -c`
 results in success (at least, it correctly receives the supplied
 arguments).

 I gather that this is related to MESOS-2115, and it's clear that this
 patch[1] changed that behavior significantly, but if it introduced a
 bug I can't see it. It's possible that my instance is configured
 incorrectly as well; the documentation here is a bit vague and there
 aren't many examples on the web.

 Thanks in advance,
 --
 b

 [1]:
 https://github.com/apache/mesos/commit/3baa60965407bf0c3eb9c3da1b2ba7c0a4fee968
