> Are there any additional logs that I should be examining?

Yes. The stderr/stdout of the executor lives in the task sandbox, whose location you'll find in the Mesos agent's log (it includes the task ID). I suspect you'll find an error similar to the one described in the ticket I linked below.
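If it helps, here is a minimal sketch for locating those files on the agent host. It assumes the standard Mesos sandbox layout under the agent's work_dir (slaves/<agent>/frameworks/<framework>/executors/<executor>/runs/<container>/). The helper name find_executor_logs and the /tmp/mesos work_dir are illustrative assumptions, not values taken from your setup.

```python
import glob
import os


def find_executor_logs(work_dir, executor_id):
    """Return stdout/stderr paths for every run of one executor.

    Assumes the sandbox layout:
    <work_dir>/slaves/<agent>/frameworks/<framework>/executors/<executor>/runs/<container>/
    """
    pattern = os.path.join(
        glob.escape(work_dir), "slaves", "*", "frameworks", "*",
        "executors", glob.escape(executor_id), "runs", "*", "std*",
    )
    # Skip a "latest" symlink, if present, so each run is listed once.
    return sorted(
        path for path in glob.glob(pattern)
        if os.path.basename(os.path.dirname(path)) != "latest"
        and os.path.basename(path) in ("stdout", "stderr")
    )


if __name__ == "__main__":
    # Hypothetical work_dir: use whatever the agent's --work_dir is set to.
    executor = ("thermos-1447434018972-produser-staging-hello_docker-0-"
                "d5911c10-eb0a-4a6c-afda-b21793dac0fe")
    for path in find_executor_logs("/tmp/mesos", executor):
        print(path)
```

The executor ID is the one printed in the agent log lines quoted below. If nothing is found, check the work_dir the agent was actually started with.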
> Is there a canonical list of Mesos/Thermos/Aurora deps that need to be
> in a container for it to work with Aurora? And shouldn't the
> hello_docker example satisfy them?

Unfortunately, Mesos has made this a bit of a moving target - the dependencies of libmesos.so have grown over time. I have a ticket tracking this specific issue:

https://issues.apache.org/jira/browse/AURORA-1487

I will be dedicating effort to making the docker-aurora story much better over the next few releases.

On Fri, Nov 13, 2015 at 11:05 AM, Thomas Dyas <[email protected]> wrote:

> Hello,
>
> I'm running Docker 1.7.1 with Aurora 0.7.1 and Mesos 0.21.1. The Mesos
> worker nodes are running CentOS 6.7.
>
> Aurora tasks using Docker seem to hang at the "Assigning" state. This
> includes the Aurora/Docker hello_docker example at:
>
> https://github.com/apache/aurora/blob/master/examples/jobs/docker/hello_docker.aurora
>
> In the case of hello_docker, the Mesos worker's log contains this:
>
> E1113 17:01:15.017153 6961 slave.cpp:2787] Container 'c9dffa4d-b798-4645-b41d-168038731419' for executor 'thermos-1447434018972-produser-staging-hello_docker-0-d5911c10-eb0a-4a6c-afda-b21793dac0fe' of framework '20151013-012854-470221066-5050-16901-0000' failed to start: Unable to get executor pid after launch
> I1113 17:01:24.489450 6952 slave.cpp:3278] Terminating executor thermos-1447434018972-produser-staging-hello_docker-0-d5911c10-eb0a-4a6c-afda-b21793dac0fe of framework 20151013-012854-470221066-5050-16901-0000 because it did not register within 1mins
>
> The Docker daemon logs contain the following:
>
> time="2015-11-09T22:09:56.323170747Z" level=info msg="Daemon has completed initialization"
> time="2015-11-09T22:09:56.323195633Z" level=info msg="Docker daemon" commit="786b29d/1.7.1" execdriver=native-0.2 graphdriver=devicemapper version=1.7.1
> time="2015-11-09T22:13:05.341776364Z" level=info msg="GET /v1.19/version"
> time="2015-11-09T22:13:05.516699260Z" level=info msg="GET /v1.19/containers/json?all=1"
> time="2015-11-09T22:17:47.268669042Z" level=info msg="GET /v1.19/version"
> time="2015-11-09T22:17:47.449515720Z" level=info msg="GET /v1.19/containers/json?all=1"
> time="2015-11-13T17:00:24.788707886Z" level=info msg="GET /v1.19/containers/python:2.7/json"
> time="2015-11-13T17:00:24.788934073Z" level=error msg="Handler for GET /containers/{name:.*}/json returned error: no such id: python:2.7"
> time="2015-11-13T17:00:24.788964668Z" level=error msg="HTTP Error" err="no such id: python:2.7" statusCode=404
> time="2015-11-13T17:00:24.789224280Z" level=info msg="GET /v1.19/images/python:2.7/json"
> time="2015-11-13T17:00:24.789331687Z" level=error msg="Handler for GET /images/{name:.*}/json returned error: No such image: python:2.7"
> time="2015-11-13T17:00:24.789363567Z" level=error msg="HTTP Error" err="No such image: python:2.7" statusCode=404
> time="2015-11-13T17:00:24.883303396Z" level=info msg="POST /v1.19/images/create?fromImage=python%3A2.7"
> time="2015-11-13T17:00:25.821615185Z" level=info msg="Image manifest for python:2.7 has been verified"
> time="2015-11-13T17:01:14.149038177Z" level=info msg="GET /v1.19/containers/python:2.7/json"
> time="2015-11-13T17:01:14.149309024Z" level=error msg="Handler for GET /containers/{name:.*}/json returned error: no such id: python:2.7"
> time="2015-11-13T17:01:14.149356429Z" level=error msg="HTTP Error" err="no such id: python:2.7" statusCode=404
> time="2015-11-13T17:01:14.149663598Z" level=info msg="GET /v1.19/images/python:2.7/json"
> time="2015-11-13T17:01:14.255028943Z" level=info msg="POST /v1.19/containers/create?name=mesos-c9dffa4d-b798-4645-b41d-168038731419"
> time="2015-11-13T17:01:14.562814875Z" level=info msg="POST /v1.19/containers/248e89eb41ab0470c44da5c11c331bf950084c32649d5454d04bc8a9aa50eea7/start"
> time="2015-11-13T17:01:14.891568210Z" level=info msg="GET /v1.19/containers/mesos-c9dffa4d-b798-4645-b41d-168038731419/json"
>
> `docker pull python:2.7` succeeds. Running Docker containers on the
> worker directly without Aurora/Mesos works fine.
>
> It seems to be related to not having Thermos execute, but using the
> nginx container from here also fails similarly:
>
> https://github.com/livewyer-ops/aurora-web-containers/tree/master/Dockerfiles
> (and it seemed to be designed to include the right deps).
>
> Am I missing something obvious? Are there any additional logs that I
> should be examining?
>
> Is there a canonical list of Mesos/Thermos/Aurora deps that need to be
> in a container for it to work with Aurora? And shouldn't the
> hello_docker example satisfy them?
>
> -Tom
>
