I've been trying out the Docker integration with Mesos and Marathon since bridged networking was added, and I've run into a couple of issues. The most disturbing are the allocation of ports that are already in use (I suspect this may be a Marathon issue) and the failure to recover the tasks once this occurs.
What I am running is a very simple setup, driven locally from Vagrant. I attempt to run the python3 container specified here under Bridged Networking (https://mesosphere.github.io/marathon/docs/native-docker.html).

What I see is that, while the image is being pulled for the first time, every task exits as KILLED. Once the image has been pulled, the container starts, but Mesos does not realise this, so its attempts to start additional containers fail with port allocation conflicts. Killing the unrecognised container in Docker unblocks Mesos, which can then start the containers.

Now, once this is started, if I attempt to scale the number of instances up in Marathon, I see in the UI that it attempts to start another container (a third in my case, with two slaves) with the same port allocations that are already in use on the slave.

This is the error in the slave logs:

E1005 10:41:01.812988 2883 slave.cpp:2485] Container '05cf52f1-b915-45e5-9071-6b46fda3b71c' for executor 'bridged-webapp.18747ba3-4c7c-11e4-9567-080027100ea3' of framework '20141005-083953-159390892-5050-9177-0000' failed to start: Failed to 'docker run -d -c 512 -m 67108864 -e PORT=31000 -e PORT0=31000 -e PORTS=31000,31001 -e PORT1=31001 -e MESOS_SANDBOX=/mnt/mesos/sandbox -v /tmp/mesos/slaves/20141005-101854-159390892-5050-1326-0/frameworks/20141005-083953-159390892-5050-9177-0000/executors/bridged-webapp.18747ba3-4c7c-11e4-9567-080027100ea3/runs/05cf52f1-b915-45e5-9071-6b46fda3b71c:/mnt/mesos/sandbox --net bridge -p 31000:8080/tcp -p 31001:161/udp --entrypoint /bin/sh --name mesos-05cf52f1-b915-45e5-9071-6b46fda3b71c python:3 -c python3 -m http.server 8080': exit status = exited with status 1 stderr = WARNING: Your kernel does not support swap limit capabilities. Limitation discarded.
2014/10/05 10:41:01 Error response from daemon: Cannot start container b2516e3356ca1cf3163f6926249b4e936ec9afe4549ee37f4a9d5df62dbbaf1b: Bind for 0.0.0.0:31000 failed: port is already allocated

There is nothing in the stderr or stdout of the task.

I have set up the slaves according to the docs (set the containerizers and the timeout) - any help here would be appreciated.

Cheers,
Ryan
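
P.S. For anyone trying to reproduce this, below is a rough sketch of how one might check, on a slave, whether the host ports named in a `docker run` command line are already held by something. It is not part of Mesos, Marathon, or Docker; the helper names `host_ports` and `is_free` are made up for illustration, and it assumes the simple `-p HOST:CONTAINER/proto` form seen in the log above. A failed bind only shows that something holds the port, not which container or process it is.

```python
import shlex
import socket


def host_ports(docker_args):
    """Return (host_port, protocol) pairs for each '-p HOST:CONTAINER[/proto]' flag.

    Hypothetical helper: only handles the two-part HOST:CONTAINER form
    that appears in the slave log, not the IP:HOST:CONTAINER variant.
    """
    tokens = shlex.split(docker_args)
    ports = []
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-p":
            host, _, rest = value.partition(":")
            proto = rest.partition("/")[2] or "tcp"
            ports.append((int(host), proto))
    return ports


def is_free(port, proto="tcp"):
    """Return True if this process can bind 0.0.0.0:port for the given protocol."""
    kind = socket.SOCK_DGRAM if proto == "udp" else socket.SOCK_STREAM
    with socket.socket(socket.AF_INET, kind) as sock:
        try:
            sock.bind(("0.0.0.0", port))
        except OSError:
            return False
        return True


if __name__ == "__main__":
    # Arguments copied from the failing 'docker run' in the slave log.
    args = ("--net bridge -p 31000:8080/tcp -p 31001:161/udp "
            "--name mesos-05cf52f1-b915-45e5-9071-6b46fda3b71c python:3")
    for port, proto in host_ports(args):
        state = "free" if is_free(port, proto) else "in use"
        print(f"{port}/{proto}: {state}")
```

Running this on the slave just before scaling up should show whether 31000 and 31001 are already taken there, which would help separate "Marathon offered ports that are in use" from "Mesos lost track of a running container".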

