Hello, I'm running Docker 1.7.1 with Aurora 0.7.1 and Mesos 0.21.1. The Mesos worker nodes are running CentOS 6.7.

Aurora tasks using Docker seem to hang at the "Assigning" state. This includes the Aurora/Docker hello_docker example at:
https://github.com/apache/aurora/blob/master/examples/jobs/docker/hello_docker.aurora

In the case of hello_docker, the Mesos worker's log contains this:

E1113 17:01:15.017153 6961 slave.cpp:2787] Container 'c9dffa4d-b798-4645-b41d-168038731419' for executor 'thermos-1447434018972-produser-staging-hello_docker-0-d5911c10-eb0a-4a6c-afda-b21793dac0fe' of framework '20151013-012854-470221066-5050-16901-0000' failed to start: Unable to get executor pid after launch
I1113 17:01:24.489450 6952 slave.cpp:3278] Terminating executor thermos-1447434018972-produser-staging-hello_docker-0-d5911c10-eb0a-4a6c-afda-b21793dac0fe of framework 20151013-012854-470221066-5050-16901-0000 because it did not register within 1mins

The Docker daemon logs contain the following:

time="2015-11-09T22:09:56.323170747Z" level=info msg="Daemon has completed initialization"
time="2015-11-09T22:09:56.323195633Z" level=info msg="Docker daemon" commit="786b29d/1.7.1" execdriver=native-0.2 graphdriver=devicemapper version=1.7.1
time="2015-11-09T22:13:05.341776364Z" level=info msg="GET /v1.19/version"
time="2015-11-09T22:13:05.516699260Z" level=info msg="GET /v1.19/containers/json?all=1"
time="2015-11-09T22:17:47.268669042Z" level=info msg="GET /v1.19/version"
time="2015-11-09T22:17:47.449515720Z" level=info msg="GET /v1.19/containers/json?all=1"
time="2015-11-13T17:00:24.788707886Z" level=info msg="GET /v1.19/containers/python:2.7/json"
time="2015-11-13T17:00:24.788934073Z" level=error msg="Handler for GET /containers/{name:.*}/json returned error: no such id: python:2.7"
time="2015-11-13T17:00:24.788964668Z" level=error msg="HTTP Error" err="no such id: python:2.7" statusCode=404
time="2015-11-13T17:00:24.789224280Z" level=info msg="GET /v1.19/images/python:2.7/json"
time="2015-11-13T17:00:24.789331687Z" level=error msg="Handler for GET /images/{name:.*}/json returned error: No such image: python:2.7"
time="2015-11-13T17:00:24.789363567Z" level=error msg="HTTP Error" err="No such image: python:2.7" statusCode=404
time="2015-11-13T17:00:24.883303396Z" level=info msg="POST /v1.19/images/create?fromImage=python%3A2.7"
time="2015-11-13T17:00:25.821615185Z" level=info msg="Image manifest for python:2.7 has been verified"
time="2015-11-13T17:01:14.149038177Z" level=info msg="GET /v1.19/containers/python:2.7/json"
time="2015-11-13T17:01:14.149309024Z" level=error msg="Handler for GET /containers/{name:.*}/json returned error: no such id: python:2.7"
time="2015-11-13T17:01:14.149356429Z" level=error msg="HTTP Error" err="no such id: python:2.7" statusCode=404
time="2015-11-13T17:01:14.149663598Z" level=info msg="GET /v1.19/images/python:2.7/json"
time="2015-11-13T17:01:14.255028943Z" level=info msg="POST /v1.19/containers/create?name=mesos-c9dffa4d-b798-4645-b41d-168038731419"
time="2015-11-13T17:01:14.562814875Z" level=info msg="POST /v1.19/containers/248e89eb41ab0470c44da5c11c331bf950084c32649d5454d04bc8a9aa50eea7/start"
time="2015-11-13T17:01:14.891568210Z" level=info msg="GET /v1.19/containers/mesos-c9dffa4d-b798-4645-b41d-168038731419/json"

`docker pull python:2.7` succeeds. Running Docker containers on the worker directly without Aurora/Mesos works fine.

It seems to be related to not having Thermos execute, but using the nginx container from here also fails similarly:
https://github.com/livewyer-ops/aurora-web-containers/tree/master/Dockerfiles
(and it seemed to be designed to include the right deps).

Am I missing something obvious? Are there any additional logs that I should be examining?

Is there a canonical list of Mesos/Thermos/Aurora deps that need to be in a container for it to work with Aurora? And shouldn't the hello_docker example satisfy them?

-Tom
