Sorry for that, here is the direct link: http://mail-archives.apache.org/mod_mbox/mesos-user/201410.mbox/%3CCALnA0DwbqJdEm3at7SonsXmCwxnN3%3DCUrwgoPBHXJDoOFyJjig%40mail.gmail.com%3E
On Thu, Oct 23, 2014 at 10:49 AM, Nils De Moor <[email protected]> wrote:
> I had the same issue as you did, here is how I fixed it:
> http://mail-archives.apache.org/mod_mbox/mesos-user/201410.mbox/browser
>
> On Thu, Oct 23, 2014 at 1:42 AM, Connor Doyle <[email protected]> wrote:
>
>> Hi Eduardo,
>>
>> There is a known defect in Mesos that matches your description:
>> https://issues.apache.org/jira/browse/MESOS-1915
>> https://issues.apache.org/jira/browse/MESOS-1884
>>
>> A fix will be included in the next release.
>> https://reviews.apache.org/r/26486
>>
>> You see the killTask because the default --task_launch_timeout value for
>> Marathon is 60 seconds.
>> Created an issue to make the logging around this better:
>> https://github.com/mesosphere/marathon/issues/732
>>
>> --
>> Connor
>>
>>
>> On Oct 22, 2014, at 16:18, Eduardo Jiménez <[email protected]> wrote:
>>
>> > Hi,
>> >
>> > I've started experimenting with Mesos using the Docker containerizer,
>> > and running a simple example got into a very strange state.
>> >
>> > I have mesos-0.20.1 and marathon-0.7 set up on EC2, using Amazon Linux:
>> >
>> > Linux <ip> 3.14.20-20.44.amzn1.x86_64 #1 SMP Mon Oct 6 22:52:46 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>> >
>> > Docker version 1.2.0, build fa7b24f/1.2.0
>> >
>> > I start the mesos-slave with these relevant options:
>> >
>> > --cgroups_hierarchy=/cgroup
>> > --containerizers=docker,mesos
>> > --executor_registration_timeout=5mins
>> > --isolation=cgroups/cpu,cgroups/mem
>> >
>> > I launched a very simple app, which is from the Mesosphere examples:
>> >
>> > {
>> >   "container": {
>> >     "type": "DOCKER",
>> >     "docker": {
>> >       "image": "libmesos/ubuntu"
>> >     }
>> >   },
>> >   "id": "ubuntu-docker2",
>> >   "instances": "1",
>> >   "cpus": "0.5",
>> >   "mem": "512",
>> >   "uris": [],
>> >   "cmd": "while sleep 10; do date -u +%T; done"
>> > }
>> >
>> > The app launches, but then Mesos states the task is KILLED, yet the
>> > Docker container is STILL running. Here's the sequence of logs from that
>> > mesos-slave.
>> >
>> > 1) Task gets created and assigned:
>> >
>> > I1022 17:44:13.971096 15195 slave.cpp:1002] Got assigned task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 for framework 20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:44:13.971367 15195 slave.cpp:1112] Launching task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 for framework 20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:44:13.973047 15195 slave.cpp:1222] Queuing task 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799' for executor ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework '20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:44:13.989893 15195 docker.cpp:743] Starting container 'c1fc27c8-13e9-484f-a30c-cb062ec4c978' for task 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799' (and executor 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799') of framework '20141017-172055-3489660938-5050-1603-0000'
>> >
>> > So far so good. The log statements right after "Starting container" are:
>> >
>> > I1022 17:45:14.893309 15196 slave.cpp:1278] Asked to kill task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework 20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:45:14.894579 15196 slave.cpp:2088] Handling status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework 20141017-172055-3489660938-5050-1603-0000 from @0.0.0.0:0
>> > W1022 17:45:14.894798 15196 slave.cpp:1354] Killing the unregistered executor 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799' of framework 20141017-172055-3489660938-5050-1603-0000 because it has no tasks
>> > E1022 17:45:14.925014 15192 slave.cpp:2205] Failed to update resources for container c1fc27c8-13e9-484f-a30c-cb062ec4c978 of executor ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 running task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 on status update for terminal task, destroying container: No container found
>> >
>> > After this, there are several log messages like this:
>> >
>> > I1022 17:45:14.926197 15194 status_update_manager.cpp:320] Received status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework 20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:45:14.926378 15194 status_update_manager.cpp:373] Forwarding status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework 20141017-172055-3489660938-5050-1603-0000 to [email protected]:5050
>> > W1022 17:45:16.169214 15196 status_update_manager.cpp:181] Resending status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework 20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:45:16.169275 15196 status_update_manager.cpp:373] Forwarding status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework 20141017-172055-3489660938-5050-1603-0000 to [email protected]:5050
>> >
>> > Eventually the TASK_KILLED update is acked and the Mesos UI shows the
>> > task as killed. By then, the process should be dead, but it's not.
>> >
>> > $ sudo docker ps
>> > CONTAINER ID        IMAGE                    COMMAND                CREATED             STATUS              PORTS               NAMES
>> > f76784e1af8b        libmesos/ubuntu:latest   "/bin/sh -c 'while s   5 hours ago         Up 5 hours                              mesos-c1fc27c8-13e9-484f-a30c-cb062ec4c978
>> >
>> > The container shows in the UI like this:
>> >
>> > ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799   ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799   KILLED   5 hours ago   5 hours ago
>> >
>> > And it's been running the whole time.
>> >
>> > There's no other logging indicating why killTask was invoked, which
>> > makes this extremely frustrating to debug.
>> >
>> > Has anyone seen something similar?
>> >
>> > Thanks,
>> >
>> > Eduardo
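For anyone who hits this before the fixed release ships: since Connor traced the killTask to Marathon's default --task_launch_timeout of 60 seconds, one workaround is to raise that timeout so it is at least as long as the slave's --executor_registration_timeout (5 minutes in Eduardo's setup). A rough sketch only, assuming a Marathon 0.7 install started via its bin/start wrapper and assuming the flag is given in milliseconds; adapt the launch command to however Marathon actually runs on your master:

    # Hypothetical launch command; the only point here is --task_launch_timeout.
    # 300000 ms = 5 minutes, matching --executor_registration_timeout=5mins on the slave.
    ./bin/start --master zk://<zk-host>:2181/mesos --task_launch_timeout 300000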
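Until then, a leaked container like the one above has to be stopped by hand. A minimal cleanup sketch, assuming the Docker containerizer keeps naming containers with the mesos-<containerId> prefix shown in the docker ps output above and that nothing else on the slave uses that prefix:

    # List containers started by the Docker containerizer.
    sudo docker ps | grep mesos-

    # Stop and remove the specific leaked container from this thread by name.
    sudo docker stop mesos-c1fc27c8-13e9-484f-a30c-cb062ec4c978
    sudo docker rm mesos-c1fc27c8-13e9-484f-a30c-cb062ec4c978

    # Or sweep every running mesos-* container. Check the list first: this also
    # stops containers for tasks that are still healthy on this slave.
    sudo docker ps | awk '/mesos-/ {print $NF}' | xargs -r -n1 sudo docker stop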

