I had the same issue as you did; here is how I fixed it:
http://mail-archives.apache.org/mod_mbox/mesos-user/201410.mbox/browser

On Thu, Oct 23, 2014 at 1:42 AM, Connor Doyle <[email protected]> wrote:

> Hi Eduardo,
>
> There is a known defect in Mesos that matches your description:
> https://issues.apache.org/jira/browse/MESOS-1915
> https://issues.apache.org/jira/browse/MESOS-1884
>
> A fix will be included in the next release.
> https://reviews.apache.org/r/26486
>
> You see the killTask because the default --task_launch_timeout value for
> Marathon is 60 seconds.
> I created an issue to improve the logging around this:
> https://github.com/mesosphere/marathon/issues/732
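>
> As a stopgap, the timeout can be raised so slow Docker image pulls don't
> trip it. A minimal sketch (the start script path and master URL are
> placeholders, and I'm assuming the 0.7.x flag is read in milliseconds):
>
>   # Placeholder paths/URLs; adjust for your deployment.
>   # Assumes --task_launch_timeout is in milliseconds (default 60000 = 60s).
>   ./bin/start --master zk://localhost:2181/mesos \
>               --task_launch_timeout 600000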
>
> --
> Connor
>
>
> On Oct 22, 2014, at 16:18, Eduardo Jiménez <[email protected]> wrote:
>
> > Hi,
> >
> > I've started experimenting with Mesos using the Docker containerizer,
> > and while running a simple example I got into a very strange state.
> >
> > I have mesos-0.20.1 and marathon-0.7 set up on EC2, using Amazon Linux:
> >
> > Linux <ip> 3.14.20-20.44.amzn1.x86_64 #1 SMP Mon Oct 6 22:52:46 UTC 2014
> x86_64 x86_64 x86_64 GNU/Linux
> >
> > Docker version 1.2.0, build fa7b24f/1.2.0
> >
> > I start the mesos slave with these relevant options:
> >
> > --cgroups_hierarchy=/cgroup
> > --containerizers=docker,mesos
> > --executor_registration_timeout=5mins
> > --isolation=cgroups/cpu,cgroups/mem
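> >
> > (For completeness, those flags combine into an invocation roughly like the
> > sketch below; the --master address is a placeholder, not my exact value.)
> >
> >   # Placeholder master URL; the four flags above are the relevant ones.
> >   mesos-slave --master=zk://localhost:2181/mesos \
> >     --cgroups_hierarchy=/cgroup \
> >     --containerizers=docker,mesos \
> >     --executor_registration_timeout=5mins \
> >     --isolation=cgroups/cpu,cgroups/mem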
> >
> > I launched a very simple app, which is from the mesosphere examples:
> >
> > {
> >   "container": {
> >     "type": "DOCKER",
> >     "docker": {
> >       "image": "libmesos/ubuntu"
> >     }
> >   },
> >   "id": "ubuntu-docker2",
> >   "instances": "1",
> >   "cpus": "0.5",
> >   "mem": "512",
> >   "uris": [],
> >   "cmd": "while sleep 10; do date -u +%T; done"
> > }
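> >
> > (For reference, an app definition like this can be pushed to Marathon's
> > REST API with a plain POST to /v2/apps; the hostname and file name below
> > are placeholders:)
> >
> >   # Placeholder Marathon endpoint and JSON file name.
> >   curl -X POST http://marathon-host:8080/v2/apps \
> >        -H 'Content-Type: application/json' \
> >        -d @ubuntu-docker2.json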
> >
> > The app launches, but then Mesos states the task is KILLED, yet the
> > Docker container is STILL running. Here's the sequence of logs from that
> > mesos-slave.
> >
> > 1) Task gets created and assigned:
> >
> > I1022 17:44:13.971096 15195 slave.cpp:1002] Got assigned task
> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 for framework
> 20141017-172055-3489660938-5050-1603-0000
> > I1022 17:44:13.971367 15195 slave.cpp:1112] Launching task
> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 for framework
> 20141017-172055-3489660938-5050-1603-0000
> > I1022 17:44:13.973047 15195 slave.cpp:1222] Queuing task
> 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799' for executor
> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
> '20141017-172055-3489660938-5050-1603-0000
> > I1022 17:44:13.989893 15195 docker.cpp:743] Starting container
> 'c1fc27c8-13e9-484f-a30c-cb062ec4c978' for task
> 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799' (and executor
> 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799') of framework
> '20141017-172055-3489660938-5050-1603-0000'
> >
> > So far so good. The log statements right after "Starting container" are:
> >
> > I1022 17:45:14.893309 15196 slave.cpp:1278] Asked to kill task
> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
> 20141017-172055-3489660938-5050-1603-0000
> > I1022 17:45:14.894579 15196 slave.cpp:2088] Handling status update
> TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for task
> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
> 20141017-172055-3489660938-5050-1603-0000 from @0.0.0.0:0
> > W1022 17:45:14.894798 15196 slave.cpp:1354] Killing the unregistered
> executor 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799' of framework
> 20141017-172055-3489660938-5050-1603-0000 because it has no tasks
> > E1022 17:45:14.925014 15192 slave.cpp:2205] Failed to update resources
> for container c1fc27c8-13e9-484f-a30c-cb062ec4c978 of executor
> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 running task
> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 on status update for
> terminal task, destroying container: No container found
> >
> > After this, there are several log messages like this:
> >
> > I1022 17:45:14.926197 15194 status_update_manager.cpp:320] Received
> status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for
> task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
> 20141017-172055-3489660938-5050-1603-0000
> > I1022 17:45:14.926378 15194 status_update_manager.cpp:373] Forwarding
> status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for
> task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
> 20141017-172055-3489660938-5050-1603-0000 to [email protected]:5050
> > W1022 17:45:16.169214 15196 status_update_manager.cpp:181] Resending
> status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for
> task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
> 20141017-172055-3489660938-5050-1603-0000
> > I1022 17:45:16.169275 15196 status_update_manager.cpp:373] Forwarding
> status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for
> task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
> 20141017-172055-3489660938-5050-1603-0000 to [email protected]:5050
> >
> >
> > Eventually the TASK_KILLED update is acked and the Mesos UI shows the
> > task as killed. By then, the process should be dead, but it's not.
> >
> > $ sudo docker ps
> > CONTAINER ID        IMAGE                    COMMAND                CREATED             STATUS              PORTS               NAMES
> > f76784e1af8b        libmesos/ubuntu:latest   "/bin/sh -c 'while s   5 hours ago         Up 5 hours                              mesos-c1fc27c8-13e9-484f-a30c-cb062ec4c978
> >
> >
> > The container shows in the UI like this:
> >
> > ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799     ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799     KILLED     5 hours ago     5 hours ago
> > And it's been running the whole time.
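> >
> > (In the meantime the orphaned container can be removed by hand; the name
> > below is the one shown by docker ps above:)
> >
> >   # Force-remove the leftover container that Mesos no longer tracks.
> >   sudo docker rm -f mesos-c1fc27c8-13e9-484f-a30c-cb062ec4c978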
> >
> > There's no other logging indicating why killTask was invoked, which
> makes this extremely frustrating to debug.
> >
> > Has anyone seen something similar?
> >
> > Thanks,
> >
> > Eduardo
> >
>
>
