> Are there any additional logs that I should be examining?

Yes.  The stderr/stdout of the executor lives in the task sandbox, whose
path you'll find in the Mesos agent's log (the entry includes the task ID).
I suspect you'll find an error similar to what's mentioned in the ticket I
linked below.
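
To narrow the agent log down to one task, a minimal sketch along these
lines may help. It assumes the agent log uses the glog-style prefix seen in
the excerpt quoted further down (severity and date, time, thread ID,
source file and line, then the message) and that each entry sits on one
line. The names GLOG_LINE and find_entries are made up for this
illustration and are not part of Mesos or Aurora.

```python
import re

# Assumed line layout, taken from the excerpt in this thread:
#   E1113 17:01:15.017153  6961 slave.cpp:2787] message text
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>\d+) (?P<source>\S+:\d+)\] (?P<message>.*)$"
)


def find_entries(lines, needle):
    """Return (severity, source, message) for each log line mentioning needle."""
    hits = []
    for line in lines:
        match = GLOG_LINE.match(line.rstrip("\n"))
        if match and needle in match.group("message"):
            hits.append(
                (match.group("severity"), match.group("source"), match.group("message"))
            )
    return hits
```

Calling find_entries(open(path_to_agent_log), task_id) returns the matching
entries in file order; lines that do not fit the assumed prefix are skipped
rather than guessed at.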


> Is there a canonical list of Mesos/Thermos/Aurora deps that need to be
> in a container for it to work with Aurora? And shouldn't the
> hello_docker example satisfy them?


Unfortunately Mesos has made this a bit of a moving target - the
dependencies of libmesos.so have grown over time.  I have a ticket tracking
this specific issue: https://issues.apache.org/jira/browse/AURORA-1487
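
In the absence of a canonical list, one way to see which shared libraries
an image is missing is to run ldd against libmesos.so inside the container
and collect the entries it reports as "not found". The sketch below is only
an illustration: the helper names are made up, the location of libmesos.so
depends on how it reaches the container, and it assumes a glibc-style ldd
is available in the image.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "<name> => not found"
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing


def check(path):
    """Run ldd on the given file and return its unresolved libraries."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return missing_libraries(result.stdout)
```

Running check() on the path to libmesos.so from inside the container
returns an empty list when every dependency resolves; anything else names
the libraries the image still needs.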

I will be dedicating effort to making the docker-aurora story much better
over the next few releases.


On Fri, Nov 13, 2015 at 11:05 AM, Thomas Dyas <[email protected]> wrote:

> Hello,
>
> I'm running Docker 1.7.1 with Aurora 0.7.1 and Mesos 0.21.1. The Mesos
> worker nodes are running CentOS 6.7.
>
> Aurora tasks using Docker seem to hang at the "Assigning" state. This
> includes the Aurora/Docker hello_docker example at:
>
> https://github.com/apache/aurora/blob/master/examples/jobs/docker/hello_docker.aurora
>
> In the case of hello_docker, the Mesos worker's log contains this:
>
> E1113 17:01:15.017153  6961 slave.cpp:2787] Container
> 'c9dffa4d-b798-4645-b41d-168038731419' for executor
> 'thermos-1447434018972-produser-staging-hello_docker-0-d5911c10-eb0a-4a6c-afda-b21793dac0fe'
> of framework '20151013-012854-470221066-5050-16901-0000' failed to
> start: Unable to get executor pid after launch
> I1113 17:01:24.489450  6952 slave.cpp:3278] Terminating executor
> thermos-1447434018972-produser-staging-hello_docker-0-d5911c10-eb0a-4a6c-afda-b21793dac0fe
> of framework 20151013-012854-470221066-5050-16901-0000 because it did
> not register within 1mins
>
> The Docker daemon logs contain the following:
>
> time="2015-11-09T22:09:56.323170747Z" level=info msg="Daemon has
> completed initialization"
> time="2015-11-09T22:09:56.323195633Z" level=info msg="Docker daemon"
> commit="786b29d/1.7.1" execdriver=native-0.2 graphdriver=devicemapper
> version=1.7.1
> time="2015-11-09T22:13:05.341776364Z" level=info msg="GET /v1.19/version"
> time="2015-11-09T22:13:05.516699260Z" level=info msg="GET
> /v1.19/containers/json?all=1"
> time="2015-11-09T22:17:47.268669042Z" level=info msg="GET /v1.19/version"
> time="2015-11-09T22:17:47.449515720Z" level=info msg="GET
> /v1.19/containers/json?all=1"
> time="2015-11-13T17:00:24.788707886Z" level=info msg="GET
> /v1.19/containers/python:2.7/json"
> time="2015-11-13T17:00:24.788934073Z" level=error msg="Handler for GET
> /containers/{name:.*}/json returned error: no such id: python:2.7"
> time="2015-11-13T17:00:24.788964668Z" level=error msg="HTTP Error"
> err="no such id: python:2.7" statusCode=404
> time="2015-11-13T17:00:24.789224280Z" level=info msg="GET
> /v1.19/images/python:2.7/json"
> time="2015-11-13T17:00:24.789331687Z" level=error msg="Handler for GET
> /images/{name:.*}/json returned error: No such image: python:2.7"
> time="2015-11-13T17:00:24.789363567Z" level=error msg="HTTP Error"
> err="No such image: python:2.7" statusCode=404
> time="2015-11-13T17:00:24.883303396Z" level=info msg="POST
> /v1.19/images/create?fromImage=python%3A2.7"
> time="2015-11-13T17:00:25.821615185Z" level=info msg="Image manifest
> for python:2.7 has been verified"
> time="2015-11-13T17:01:14.149038177Z" level=info msg="GET
> /v1.19/containers/python:2.7/json"
> time="2015-11-13T17:01:14.149309024Z" level=error msg="Handler for GET
> /containers/{name:.*}/json returned error: no such id: python:2.7"
> time="2015-11-13T17:01:14.149356429Z" level=error msg="HTTP Error"
> err="no such id: python:2.7" statusCode=404
> time="2015-11-13T17:01:14.149663598Z" level=info msg="GET
> /v1.19/images/python:2.7/json"
> time="2015-11-13T17:01:14.255028943Z" level=info msg="POST
> /v1.19/containers/create?name=mesos-c9dffa4d-b798-4645-b41d-168038731419"
> time="2015-11-13T17:01:14.562814875Z" level=info msg="POST
> /v1.19/containers/248e89eb41ab0470c44da5c11c331bf950084c32649d5454d04bc8a9aa50eea7/start"
> time="2015-11-13T17:01:14.891568210Z" level=info msg="GET
> /v1.19/containers/mesos-c9dffa4d-b798-4645-b41d-168038731419/json"
>
> `docker pull python:2.7` succeeds. Running Docker containers on the
> worker directly without Aurora/Mesos works fine.
>
> It seems to be related to not having Thermos execute, but using the
> nginx container from here also fails similarly:
>
> https://github.com/livewyer-ops/aurora-web-containers/tree/master/Dockerfiles
> (and it seemed to be designed to include the right deps).
>
> Am I missing something obvious? Are there any additional logs that I
> should be examining?
>
> Is there a canonical list of Mesos/Thermos/Aurora deps that need to be
> in a container for it to work with Aurora? And shouldn't the
> hello_docker example satisfy them?
>
> -Tom
>
