ugh. Thanks! I knew this was an issue, and completely ignored the fact that someone changed the nameā¦
Thanks - works fine now. Tyson On Jun 19, 2015, at 3:39 PM, Brian Devins <[email protected]<mailto:[email protected]>> wrote: You can't name the container mesos-slave. The slave currently sees all containers prefixed with 'mesos-' as one it is supposed to administer so it is killing itself off since it doesn't match a task that should be running. On Fri, Jun 19, 2015 at 6:35 PM, Tyson Norris <[email protected]<mailto:[email protected]>> wrote: Hi - Sorry for the delay, just getting back to this. Below is the command and stdout I get. I tried specifying just mesos containerizes, as this person mentioned https://github.com/mesosphere/coreos-setup/issues/5 and had similar results - works fine with mesos containerized, but not docker. Also I am only seeing this failing on a RHEL 7 + docker containerizer, works fine on ubuntu 14 with docker containerizer. Thanks Tyson [root@phx-8 ~]# docker run --rm -it \ > --name mesos-slave \ > --net host \ > --pid host \ > --privileged \ > --env MESOS_CONTAINERIZERS=docker \ > --env MESOS_EXECUTOR_REGISTRATION_TIMEOUT=5mins \ > --env MESOS_HOSTNAME=192.168.8.8 \ > --env MESOS_IP=192.168.8.8 \ > --env MESOS_LOG_DIR=/var/log/mesos \ > --env MESOS_LOGGING_LEVEL=INFO \ > --env > MESOS_MASTER=zk://zk1.service.consul:2181,zk2.service.consul:2181,zk3.service.consul:2181/mesos > \ > --env SERVICE_5051_NAME=mesos-slave \ > --env > MESOS_DOCKER_MESOS_IMAGE=docker.corp.adobe.com/tnorris/mesosslave:0.22.1-1.0.ubuntu1404<http://docker.corp.adobe.com/tnorris/mesosslave:0.22.1-1.0.ubuntu1404> > \ > --env GLOG_v=1 \ > --volume /var/run/docker.sock:/var/run/docker.sock \ > --volume /sys:/sys:ro \ > -p 0.0.0.0:5051:5051 \ > --entrypoint mesos-slave \ > > docker.corp.adobe.com/tnorris/mesosslave:0.22.1-1.0.ubuntu1404<http://docker.corp.adobe.com/tnorris/mesosslave:0.22.1-1.0.ubuntu1404> WARNING: Logging before InitGoogleLogging() is written to STDERR I0619 22:32:37.081535 8362 process.cpp:961] libprocess is initialized on 192.168.8.8:5051<http://192.168.8.8:5051/> for 8 cpus I0619 22:32:37.160727 8362 logging.cpp:172] INFO level logging started! I0619 22:32:37.161571 8362 logging.cpp:177] Logging to /var/log/mesos I0619 22:32:37.161609 8362 main.cpp:156] Build: 2015-05-05 06:15:50 by root I0619 22:32:37.161629 8362 main.cpp:158] Version: 0.22.1 I0619 22:32:37.161639 8362 main.cpp:161] Git tag: 0.22.1 I0619 22:32:37.161650 8362 main.cpp:165] Git SHA: d6309f92a7f9af3ab61a878403e3d9c284ea87e0 2015-06-19 22:32:40,196:8362(0x7f38e66b0700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5 2015-06-19 22:32:40,196:8362(0x7f38e66b0700):ZOO_INFO@log_env@716: Client environment:host.name<http://host.name/>=phx-8.corp.adobe.com<http://phx-8.corp.adobe.com/> 2015-06-19 22:32:40,196:8362(0x7f38e66b0700):ZOO_INFO@log_env@723: Client environment:os.name<http://os.name/>=Linux 2015-06-19 22:32:40,196:8362(0x7f38e66b0700):ZOO_INFO@log_env@724: Client environment:os.arch=3.10.0-123.el7.x86_64 2015-06-19 22:32:40,196:8362(0x7f38e66b0700):ZOO_INFO@log_env@725: Client environment:os.version=#1 SMP Mon May 5 11:16:57 EDT 2014 I0619 22:32:40.196564 8362 main.cpp:200] Starting Mesos slave I0619 22:32:40.203459 8362 slave.cpp:174] Slave started on 1)@192.168.8.8:5051<http://192.168.8.8:5051/> I0619 22:32:40.205621 8362 slave.cpp:322] Slave resources: cpus(*):4; mem(*):14864; disk(*):4975; ports(*):[31000-32000] I0619 22:32:40.206074 8362 slave.cpp:351] Slave hostname: 192.168.8.8 I0619 22:32:40.206116 8362 slave.cpp:352] Slave checkpoint: true 2015-06-19 22:32:40,208:8362(0x7f38e66b0700):ZOO_INFO@log_env@733: Client environment:user.name<http://user.name/>=(null) 2015-06-19 22:32:40,208:8362(0x7f38e66b0700):ZOO_INFO@log_env@741: Client environment:user.home=/root 2015-06-19 22:32:40,208:8362(0x7f38e66b0700):ZOO_INFO@log_env@753: Client environment:user.dir=/ 2015-06-19 22:32:40,208:8362(0x7f38e66b0700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=zk1.service.consul:2181,zk2.service.consul:2181,zk3.service.consul:2181 sessionTimeout=10000 watcher=0x7f38ea110a60 sessionId=0 sessionPasswd=<null> context=0x7f38d4000ea0 flags=0 I0619 22:32:40.208984 8367 state.cpp:35] Recovering state from '/tmp/mesos/meta' I0619 22:32:40.209102 8367 slave.cpp:600] Successfully attached file '/var/log/mesos/mesos-slave.INFO' I0619 22:32:40.209174 8367 status_update_manager.cpp:197] Recovering status update manager I0619 22:32:40.230962 8367 docker.cpp:423] Recovering Docker containers I0619 22:32:40.231061 8367 docker.cpp:697] Running docker ps -a 2015-06-19 22:32:40,252:8362(0x7f38e1a4b700):ZOO_INFO@check_events@1703: initiated connection to server [192.168.8.3:2181<http://192.168.8.3:2181/>] 2015-06-19 22:32:40,256:8362(0x7f38e1a4b700):ZOO_INFO@check_events@1750: session establishment complete on server [192.168.8.3:2181<http://192.168.8.3:2181/>], sessionId=0x24e031235050013, negotiated timeout=10000 I0619 22:32:40.256567 8367 group.cpp:313] Group process (group(1)@192.168.8.8:5051<http://192.168.8.8:5051/>) connected to ZooKeeper I0619 22:32:40.256647 8367 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) I0619 22:32:40.256671 8367 group.cpp:385] Trying to create path '/mesos' in ZooKeeper I0619 22:32:40.258224 8367 detector.cpp:138] Detected a new leader: (id='2') I0619 22:32:40.258386 8367 group.cpp:659] Trying to get '/mesos/info_0000000002' in ZooKeeper I0619 22:32:40.259188 8367 detector.cpp:452] A new leading master ([email protected]:5050<http://[email protected]:5050/>) is detected I0619 22:32:40.294416 8372 docker.cpp:568] Running docker inspect mesos-slave I0619 22:32:40.295127 8372 docker.cpp:568] Running docker inspect mesos-4c41646b-c348-4af5-81cd-dbfe9429e3db I0619 22:32:40.295805 8372 docker.cpp:568] Running docker inspect mesos-0d6af11a-139d-40d1-972e-bd608d0c73c7 I0619 22:32:40.296314 8372 docker.cpp:568] Running docker inspect mesos-cce55a3d-cf70-4f75-b6cd-25148f23c0d8 I0619 22:32:40.296826 8372 docker.cpp:568] Running docker inspect mesos-45aa3139-1c93-44e8-8704-74085f9e550b I0619 22:32:40.297291 8372 docker.cpp:568] Running docker inspect mesos-45a304e5-f84f-43ee-8e4e-fc3f6f50f7f7 I0619 22:32:40.401979 8373 docker.cpp:558] Checking if Docker container named '/mesos-slave' was started by Mesos I0619 22:32:40.402067 8373 docker.cpp:568] Checking if Mesos container with ID 'slave' has been orphaned I0619 22:32:40.402096 8373 docker.cpp:502] Running docker stop -t 0 6cea94f0bf3c91adb5c0e73b5a77ffa60a4276ef0a473c3622fb71b156fae5da I0619 22:32:40.402730 8373 docker.cpp:558] Checking if Docker container named '/mesos-4c41646b-c348-4af5-81cd-dbfe9429e3db' was started by Mesos I0619 22:32:40.402941 8373 docker.cpp:568] Checking if Mesos container with ID '4c41646b-c348-4af5-81cd-dbfe9429e3db' has been orphaned I0619 22:32:40.402982 8373 docker.cpp:502] Running docker stop -t 0 75dc2d4f49b2dda88b23cbddbfe14088bef4342f35b6386babf7f8d647eae2ea I0619 22:32:40.403493 8373 docker.cpp:558] Checking if Docker container named '/mesos-0d6af11a-139d-40d1-972e-bd608d0c73c7' was started by Mesos I0619 22:32:40.403625 8373 docker.cpp:568] Checking if Mesos container with ID '0d6af11a-139d-40d1-972e-bd608d0c73c7' has been orphaned I0619 22:32:40.403647 8373 docker.cpp:502] Running docker stop -t 0 cf024fc2dd5f2bf69c68185d29aa2c6fcd0a54fb4f34e82600429256c8b2f8ad I0619 22:32:40.404161 8373 docker.cpp:558] Checking if Docker container named '/mesos-cce55a3d-cf70-4f75-b6cd-25148f23c0d8' was started by Mesos I0619 22:32:40.404297 8373 docker.cpp:568] Checking if Mesos container with ID 'cce55a3d-cf70-4f75-b6cd-25148f23c0d8' has been orphaned I0619 22:32:40.404327 8373 docker.cpp:502] Running docker stop -t 0 ddd4dad5289ea5ec198ffc6255df72a684d4c50a98e1b9d5d23ffb419a8a6751 I0619 22:32:40.406074 8373 docker.cpp:558] Checking if Docker container named '/mesos-45aa3139-1c93-44e8-8704-74085f9e550b' was started by Mesos I0619 22:32:40.429641 8373 docker.cpp:568] Checking if Mesos container with ID '45aa3139-1c93-44e8-8704-74085f9e550b' has been orphaned I0619 22:32:40.431550 8373 docker.cpp:502] Running docker stop -t 0 484fc48259db5671accd3d83f4507c1b438943bbb9dc341378b5eed36c2720de I0619 22:32:40.434613 8373 docker.cpp:558] Checking if Docker container named '/mesos-45a304e5-f84f-43ee-8e4e-fc3f6f50f7f7' was started by Mesos I0619 22:32:40.446907 8373 docker.cpp:568] Checking if Mesos container with ID '45a304e5-f84f-43ee-8e4e-fc3f6f50f7f7' has been orphaned I0619 22:32:40.448941 8373 docker.cpp:502] Running docker stop -t 0 b7b0aec5eb15476c6e45d28e9b866add62373c7b116f52d6caffa8e55a03385e I0619 22:32:40.451561 8373 slave.cpp:3808] Finished recovery I0619 22:32:40.453068 8373 slave.cpp:647] New master detected at [email protected]:5050<http://[email protected]:5050/> I0619 22:32:40.453143 8372 status_update_manager.cpp:171] Pausing sending status updates I0619 22:32:40.454219 8373 slave.cpp:672] No credentials provided. Attempting to register without authentication I0619 22:32:40.461662 8373 slave.cpp:683] Detecting new master W0619 22:32:40.461662 8362 logging.cpp:81] RAW: Received signal SIGTERM from process 1162 of user 0; exiting [root@phx-8 ~]# On Jun 13, 2015, at 11:49 PM, Timothy Chen <[email protected]<mailto:[email protected]>> wrote: Can you enable GLOG_v=1 env variable when launching the slave and post the slave log? Tim On Jun 13, 2015, at 12:44 AM, Tyson Norris <[email protected]<mailto:[email protected]>> wrote: Hi - We are running mesos slave (0.22.0-1.0.ubuntu1404) in a docker container with docker containerizer without problems on ubuntu 14.04 docker host (with lxc-docker pkg etc added). Running the same slave container on RHEL 7.0 docker host, the container exits almost immediately after starting with: I0613 07:18:15.161931 5303 slave.cpp:3808] Finished recovery I0613 07:18:15.162677 5303 slave.cpp:647] New master detected at [email protected]<mailto:[email protected]>:5050 I0613 07:18:15.162753 5301 status_update_manager.cpp:171] Pausing sending status updates I0613 07:18:15.163051 5303 slave.cpp:672] No credentials provided. Attempting to register without authentication I0613 07:18:15.163734 5303 slave.cpp:683] Detecting new master W0613 07:18:15.163734 5293 logging.cpp:81] RAW: Received signal SIGTERM from process 1166 of user 0; exiting If I do not enable the docker containerized, the slave container runs fine. Other containers that bind mount /var/run/docker.sock also run fine. Debug docker logs are below. One difference between the ubuntu docker host and RHEL docker host is that the unbuntu host uses the aufs driver, while rhel uses devicemapper, and selinux is enabled in RHEL but not ubuntu. Thanks for any advice! Tyson Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="POST /v1.18/containers/9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd/start" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="+job start(9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="activateDeviceIfNeeded(9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="libdevmapper(6): ioctl/libdm-iface.c:1750 (4) dm info docker-253:3-16818501-9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd OF [16384] (*1)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="libdevmapper(6): ioctl/libdm-iface.c:1750 (4) dm create docker-253:3-16818501-9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd OF [16384] (*1)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="libdevmapper(6): libdm-common.c:1348 (4) docker-253:3-16818501-9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd: Stacking NODE_ADD (253,9) 0:0 0600 [verify_udev]" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="libdevmapper(6): ioctl/libdm-iface.c:1750 (4) dm reload docker-253:3-16818501-9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd OF [16384] (*1)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="libdevmapper(6): ioctl/libdm-iface.c:1750 (4) dm resume docker-253:3-16818501-9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd OF [16384] (*1)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="libdevmapper(6): libdm-common.c:1348 (4) docker-253:3-16818501-9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd: Processing NODE_ADD (253,9) 0:0 0600 [verify_udev]" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="libdevmapper(6): libdm-common.c:983 (4) Created /dev/mapper/docker-253:3-16818501-9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd" Jun 13 07:28:26 phx-8 kernel: EXT4-fs (dm-9): mounted filesystem with ordered data mode. Opts: discard Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="+job log(start, 9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd, docker.corp.adobe.com/tnorris/mesosslave:0.22.1-1.0.ubuntu1404<http://docker.corp.adobe.com/tnorris/mesosslave:0.22.1-1.0.ubuntu1404>)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="-job log(start, 9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd, docker.corp.adobe.com/tnorris/mesosslave:0.22.1-1.0.ubuntu1404<http://docker.corp.adobe.com/tnorris/mesosslave:0.22.1-1.0.ubuntu1404>) = OK (0)" Jun 13 07:28:26 phx-8 systemd-udevd: conflicting device node '/dev/mapper/docker-253:3-16818501-9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd' found, link to '/dev/dm-9' will not be created Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="Calling GET /containers/{name:.*}/json" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="GET /containers/9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd/json" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="+job container_inspect(9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd)" Jun 13 07:28:26 phx-8 systemd: Starting docker container 9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd. Jun 13 07:28:26 phx-8 systemd: Started docker container 9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd. Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="-job start(9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd) = OK (0)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="-job container_inspect(9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd) = OK (0)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="Calling POST /containers/{name:.*}/resize" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="POST /v1.18/containers/9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd/resize?h=58&w=204" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="+job resize(9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd, 58, 204)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="-job resize(9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd, 58, 204) = OK (0)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="Calling GET /version" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="GET /v1.18/version" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="+job version()" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="-job version() = OK (0)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="Calling GET /containers/json" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="GET /v1.18/containers/json?all=1" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="+job containers()" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="-job containers() = OK (0)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="Calling GET /containers/{name:.*}/json" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="GET /v1.18/containers/mesos-slave/json" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="+job container_inspect(mesos-slave)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="-job container_inspect(mesos-slave) = OK (0)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=debug msg="Calling GET /containers/{name:.*}/json" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="GET /v1.18/containers/mesos-slave2/json" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="+job container_inspect(mesos-slave2)" Jun 13 07:28:26 phx-8 docker: time="2015-06-13T07:28:26Z" level=info msg="-job container_inspect(mesos-slave2) = OK (0)" Jun 13 07:28:27 phx-8 docker: time="2015-06-13T07:28:27Z" level=debug msg="Calling POST /containers/{name:.*}/stop" Jun 13 07:28:27 phx-8 docker: time="2015-06-13T07:28:27Z" level=info msg="POST /v1.18/containers/9e897d0fd156dab5ec59f8ded2a6cdf7dc5379664c872cd7da4875b6aab9dfcd/stop?t=0"

