Hi Tim -
I am using my own framework - a modified version of mesos-storm - attempting to
launch the executors as docker containers instead of plain commands.
The TaskInfo is like:
TaskInfo task = TaskInfo.newBuilder()
    .setName("worker " + slot.getNodeId() + ":" + slot.getPort())
    .setTaskId(taskId)
    .setSlaveId(offer.getSlaveId())
    .setExecutor(ExecutorInfo.newBuilder()
        .setExecutorId(ExecutorID.newBuilder().setValue(details.getId()))
        .setData(ByteString.copyFromUtf8(executorDataStr))
        // shell=false with no value: rely on the image's entrypoint/CMD
        .setCommand(CommandInfo.newBuilder()
            .setShell(false))
        // the docker image is attached to the ExecutorInfo here
        .setContainer(ContainerInfo.newBuilder()
            .setType(ContainerInfo.Type.DOCKER)
            .setDocker(ContainerInfo.DockerInfo.newBuilder()
                .setImage("testexecutor"))))
    .build();
I understand this test image is expected to fail - but I expect it to fail via
the executor registration timeout, not by being stopped almost immediately. I'm
only using a test image because I see the same behavior with my actual image,
which properly implements the mesos-executor registration protocol.
I will try moving the Container inside the Command, and see if it survives
longer. I see now at
https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L675
that it mentions "Either ExecutorInfo or CommandInfo should be set".
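So presumably something like this (an untested sketch - the ContainerInfo moves onto the TaskInfo next to a bare CommandInfo, and the ExecutorInfo is dropped so the slave's built-in command executor handles registration for me):

TaskInfo task = TaskInfo.newBuilder()
    .setName("worker " + slot.getNodeId() + ":" + slot.getPort())
    .setTaskId(taskId)
    .setSlaveId(offer.getSlaveId())
    // no ExecutorInfo; shell=false with no value should run the
    // image's default entrypoint/CMD
    .setCommand(CommandInfo.newBuilder().setShell(false))
    .setContainer(ContainerInfo.newBuilder()
        .setType(ContainerInfo.Type.DOCKER)
        .setDocker(ContainerInfo.DockerInfo.newBuilder()
            .setImage("testexecutor")))
    .build();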
Thanks
Tyson
On Apr 18, 2015, at 12:38 PM, Tim Chen <[email protected]> wrote:
That does seem odd - how did you run this via mesos? Are you using your own
framework, or another framework like Marathon?
And what does the TaskInfo look like?
Also note that if you're just testing a container, you don't want to set an
ExecutorInfo with a command: executors in Mesos are expected to communicate
back to the Mesos slave and implement the protocol between mesos and the
executor. For a test image like this, you want to set the CommandInfo with a
ContainerInfo holding the docker image instead.
Tim
On Sat, Apr 18, 2015 at 12:17 PM, Tyson Norris <[email protected]> wrote:
Hi Tim -
Yes - as I mentioned below, when using a script like:
--------------------------------------
#!/bin/bash
until false; do
  echo "waiting for something to do something"
  sleep 0.2
done
--------------------------------------
In my sandbox stdout I get exactly 2 lines:
waiting for something to do something
waiting for something to do something
Running this container any other way, e.g. docker run --rm -it testexecutor,
the output is an endless stream of "waiting for something to do something".
So something is stopping the container, as opposed to the container just
exiting - at least that's how it looks. I only see the container stopped when
it is launched as an executor.
Also, based on the docker logs, something is calling the /container/id/stop
endpoint *before* the /container/id/logs endpoint - the stop arrives before
the logs are even tailed. That also seems incorrect, and suggests that some
code is explicitly stopping the container, rather than the container exiting
on its own.
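To double-check that ordering independently of docker.log, something like this should work (watching daemon events while the task launches - events are tagged with the source image, so grepping on the image name picks out our container):
--------------------------------------
#!/bin/bash
# stream docker daemon events and pick out the executor's container;
# a kill/stop event arriving without a preceding natural die would
# confirm something external is stopping it
docker events | grep testexecutor
--------------------------------------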
Thanks
Tyson
On Apr 18, 2015, at 3:33 AM, Tim Chen <[email protected]> wrote:
Hi Tyson,
The error message you saw in the logs about the executor exiting actually just
means the executor process has exited.
Since you're launching a custom executor with MesosSupervisor, it seems like
MesosSupervisor simply exited without reporting any task status.
Can you look at the actual logs of the container? They can be found in the
sandbox stdout and stderr logs.
Tim
On Fri, Apr 17, 2015 at 11:16 PM, Tyson Norris <[email protected]> wrote:
The sequence I see in the docker.log when my executor is launched is something
like:
GET /containers/id/json
POST /containers/id/wait
POST /containers/id/stop
GET /containers/id/logs
So I'm wondering if the slave is calling docker->stop out of order in
slave/containerizer/docker.cpp. I only see it being called in recover and
destroy, and I don't see logs indicating either of those happening - but I may
be missing something else.
Tyson
On Apr 17, 2015, at 9:42 PM, Tyson Norris <[email protected]> wrote:
mesos master INFO log says:
I0418 04:26:31.573763  6 master.cpp:3755] Sending 1 offers to framework 20150411-165219-771756460-5050-1-0000 (marathon) at [email protected]:44364
I0418 04:26:31.580003  9 master.cpp:2268] Processing ACCEPT call for offers: [ 20150418-041001-553718188-5050-1-O165 ] on slave 20150418-041001-553718188-5050-1-S0 at slave(1)@172.17.1.35:5051 (mesos-slave1.service.consul) for framework 20150411-165219-771756460-5050-1-0000 (marathon) at [email protected]:44364
I0418 04:26:31.580369  9 hierarchical.hpp:648] Recovered cpus(*):6; mem(*):3862; disk(*):13483; ports(*):[31001-32000] (total allocatable: cpus(*):6; mem(*):3862; disk(*):13483; ports(*):[31001-32000]) on slave 20150418-041001-553718188-5050-1-S0 from framework 20150411-165219-771756460-5050-1-0000
I0418 04:26:32.480036  12 master.cpp:3388] Executor insights-1-1429330829 of framework 20150418-041001-553718188-5050-1-0001 on slave 20150418-041001-553718188-5050-1-S0 at slave(1)@172.17.1.35:5051 (mesos-slave1.service.consul) terminated with signal Unknown signal 127
mesos slave INFO log says:
I0418 04:26:31.390650  8 slave.cpp:1231] Launching task mesos-slave1.service.consul-31000 for framework 20150418-041001-553718188-5050-1-0001
I0418 04:26:31.392432  8 slave.cpp:4160] Launching executor insights-1-1429330829 of framework 20150418-041001-553718188-5050-1-0001 in work directory '/tmp/mesos/slaves/20150418-041001-553718188-5050-1-S0/frameworks/20150418-041001-553718188-5050-1-0001/executors/insights-1-1429330829/runs/3cc411b0-c2e0-41ae-80c2-f0306371da5a'
I0418 04:26:31.392587  8 slave.cpp:1378] Queuing task 'mesos-slave1.service.consul-31000' for executor insights-1-1429330829 of framework '20150418-041001-553718188-5050-1-0001
I0418 04:26:31.397415  7 docker.cpp:755] Starting container '3cc411b0-c2e0-41ae-80c2-f0306371da5a' for executor 'insights-1-1429330829' and framework '20150418-041001-553718188-5050-1-0001'
I0418 04:26:31.397835  7 fetcher.cpp:238] Fetching URIs using command '/usr/libexec/mesos/mesos-fetcher'
I0418 04:26:32.177479  11 docker.cpp:1333] Executor for container '3cc411b0-c2e0-41ae-80c2-f0306371da5a' has exited
I0418 04:26:32.177817  11 docker.cpp:1159] Destroying container '3cc411b0-c2e0-41ae-80c2-f0306371da5a'
I0418 04:26:32.177999  11 docker.cpp:1248] Running docker stop on container '3cc411b0-c2e0-41ae-80c2-f0306371da5a'
I0418 04:26:32.177620  6 slave.cpp:3135] Monitoring executor 'insights-1-1429330829' of framework '20150418-041001-553718188-5050-1-0001' in container '3cc411b0-c2e0-41ae-80c2-f0306371da5a'
I0418 04:26:32.477990  12 slave.cpp:3186] Executor 'insights-1-1429330829' of framework 20150418-041001-553718188-5050-1-0001 has terminated with unknown status
I0418 04:26:32.479394  12 slave.cpp:2508] Handling status update TASK_LOST (UUID: 9dbc3859-0409-47b4-888f-2871b0b48dfa) for task mesos-slave1.service.consul-31000 of framework 20150418-041001-553718188-5050-1-0001 from @0.0.0.0:0
W0418 04:26:32.479645  12 docker.cpp:841] Ignoring updating unknown container: 3cc411b0-c2e0-41ae-80c2-f0306371da5a
I0418 04:26:32.480041  10 status_update_manager.cpp:317] Received status update TASK_LOST (UUID: 9dbc3859-0409-47b4-888f-2871b0b48dfa) for task mesos-slave1.service.consul-31000 of framework 20150418-041001-553718188-5050-1-0001
I0418 04:26:32.481073  12 slave.cpp:2753] Forwarding the update TASK_LOST (UUID: 9dbc3859-0409-47b4-888f-2871b0b48dfa) for task mesos-slave1.service.consul-31000 of framework 20150418-041001-553718188-5050-1-0001 to [email protected]:5050
docker.log says:
time="2015-04-18T04:26:31Z" level=debug msg="Calling POST /containers/create"
time="2015-04-18T04:26:31Z" level=info msg="POST /v1.18/containers/create?name=mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a"
time="2015-04-18T04:26:31Z" level=info msg="+job create(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a)"
time="2015-04-18T04:26:31Z" level=info msg="+job log(create, 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4, testexecutor:latest)"
time="2015-04-18T04:26:31Z" level=info msg="-job log(create, 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4, testexecutor:latest) = OK (0)"
time="2015-04-18T04:26:31Z" level=info msg="-job create(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a) = OK (0)"
time="2015-04-18T04:26:31Z" level=debug msg="Calling POST /containers/{name:.*}/start"
time="2015-04-18T04:26:31Z" level=info msg="POST /v1.18/containers/4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4/start"
time="2015-04-18T04:26:31Z" level=info msg="+job start(4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4)"
time="2015-04-18T04:26:31Z" level=info msg="+job log(start, 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4, testexecutor:latest)"
time="2015-04-18T04:26:31Z" level=info msg="-job log(start, 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4, testexecutor:latest) = OK (0)"
time="2015-04-18T04:26:31Z" level=debug msg="Calling GET /containers/{name:.*}/json"
time="2015-04-18T04:26:31Z" level=info msg="GET /containers/4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4/json"
time="2015-04-18T04:26:31Z" level=info msg="+job container_inspect(4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4)"
time="2015-04-18T04:26:32Z" level=info msg="-job start(4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4) = OK (0)"
time="2015-04-18T04:26:32Z" level=info msg="-job container_inspect(4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4) = OK (0)"
time="2015-04-18T04:26:32Z" level=debug msg="Calling GET /containers/{name:.*}/json"
time="2015-04-18T04:26:32Z" level=info msg="GET /v1.18/containers/mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a/json"
time="2015-04-18T04:26:32Z" level=info msg="+job container_inspect(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a)"
time="2015-04-18T04:26:32Z" level=info msg="-job container_inspect(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a) = OK (0)"
time="2015-04-18T04:26:32Z" level=debug msg="Calling GET /containers/{name:.*}/json"
time="2015-04-18T04:26:32Z" level=info msg="GET /v1.18/containers/mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a/json"
time="2015-04-18T04:26:32Z" level=info msg="+job container_inspect(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a)"
time="2015-04-18T04:26:32Z" level=info msg="-job container_inspect(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a) = OK (0)"
time="2015-04-18T04:26:32Z" level=debug msg="Calling POST /containers/{name:.*}/wait"
time="2015-04-18T04:26:32Z" level=info msg="POST /v1.18/containers/mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a/wait"
time="2015-04-18T04:26:32Z" level=info msg="+job wait(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a)"
time="2015-04-18T04:26:32Z" level=debug msg="Calling GET /containers/{name:.*}/logs"
time="2015-04-18T04:26:32Z" level=info msg="GET /v1.18/containers/mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a/logs?follow=1&stderr=1&stdout=1&tail=all"
time="2015-04-18T04:26:32Z" level=info msg="+job container_inspect(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a)"
time="2015-04-18T04:26:32Z" level=info msg="-job container_inspect(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a) = OK (0)"
time="2015-04-18T04:26:32Z" level=info msg="+job logs(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a)"
time="2015-04-18T04:26:32Z" level=debug msg="Calling POST /containers/{name:.*}/stop"
time="2015-04-18T04:26:32Z" level=info msg="POST /v1.18/containers/mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a/stop?t=0"
time="2015-04-18T04:26:32Z" level=info msg="+job stop(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a)"
time="2015-04-18T04:26:32Z" level=debug msg="Sending 15 to 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4"
time="2015-04-18T04:26:32Z" level=info msg="Container 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4 failed to exit within 0 seconds of SIGTERM - using the force"
time="2015-04-18T04:26:32Z" level=debug msg="Sending 9 to 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4"
time="2015-04-18T04:26:32Z" level=info msg="+job log(die, 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4, testexecutor:latest)"
time="2015-04-18T04:26:32Z" level=info msg="-job log(die, 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4, testexecutor:latest) = OK (0)"
time="2015-04-18T04:26:32Z" level=info msg="-job logs(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a) = OK (0)"
time="2015-04-18T04:26:32Z" level=info msg="-job wait(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a) = OK (0)"
time="2015-04-18T04:26:32Z" level=info msg="+job log(stop, 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4, testexecutor:latest)"
time="2015-04-18T04:26:32Z" level=info msg="-job log(stop, 4e8320cb2a8e4ede5fb5ae386866addfe008c0035397fe44b84f401e959f96f4, testexecutor:latest) = OK (0)"
time="2015-04-18T04:26:32Z" level=info msg="-job stop(mesos-3cc411b0-c2e0-41ae-80c2-f0306371da5a) = OK (0)"
I don’t see a syslog for the master/slave containers
Thanks
Tyson
On Apr 17, 2015, at 7:07 PM, Jason Giedymin <[email protected]> wrote:
What do any/all logs say? (syslog)
-Jason
On Apr 17, 2015, at 7:22 PM, Tyson Norris <[email protected]> wrote:
Another interesting fact:
I can restart the docker container of my executor, and it runs great.
In the test example below, notice that stdout grows as expected after
restarting the container.
So something is killing my executor container (also indicated by the "Exited
(137) About a minute ago" status - 137 being 128 + 9, i.e. the container was
SIGKILLed), but I'm still not sure what.
Thanks
Tyson
tnorris-osx:insights tnorris$ docker ps -a | grep testexec
5291fe29c9c2  testexecutor:latest  "/bin/sh -c executor  About a minute ago  Exited (137) About a minute ago  mesos-f573677c-d0ee-4aa0-abba-40b7efc7cfe9
tnorris-osx:insights tnorris$ docker start mesos-f573677c-d0ee-4aa0-abba-40b7efc7cfe9
mesos-f573677c-d0ee-4aa0-abba-40b7efc7cfe9
tnorris-osx:insights tnorris$ docker logs mesos-f573677c-d0ee-4aa0-abba-40b7efc7cfe9
waiting for something to do something
waiting for something to do something
waiting for something to do something
waiting for something to do something
[... same line repeats, output growing while the restarted container runs ...]
tnorris-osx:insights tnorris$ docker stop mesos-f573677c-d0ee-4aa0-abba-40b7efc7cfe9
On Apr 17, 2015, at 2:11 PM, Tyson Norris <[email protected]> wrote:
You can reproduce with almost any Dockerfile, I think - it seems like
launching a custom executor that is a docker container has some problem.
I just made a simple test with this Dockerfile:
--------------------------------------
#this is oracle java8 atop phusion baseimage
FROM opentable/baseimage-java8:latest

#mesos lib (not used here, but will be in our "real" executor, e.g. to
#register the executor etc)
RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv E56151BF
RUN echo "deb http://repos.mesosphere.io/$(lsb_release -is | tr '[:upper:]' '[:lower:]') $(lsb_release -cs) main" | tee /etc/apt/sources.list.d/mesosphere.list
RUN cat /etc/apt/sources.list.d/mesosphere.list
RUN apt-get update && apt-get install -y \
    mesos

ADD script.sh /usr/bin/executor-script.sh
CMD executor-script.sh
--------------------------------------
and script.sh:
--------------------------------------
#!/bin/bash
until false; do
  echo "waiting for something to do something"
  sleep 0.2
done
--------------------------------------
And in my stdout I get exactly 2 lines:
waiting for something to do something
waiting for something to do something
That is as many lines as the script can print within ~0.5 seconds (two
iterations at sleep 0.2). Something is fishy about that 0.5 seconds, but I'm
not sure where.
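For comparison, the image streams output forever when run outside of mesos - a quick sanity check (assuming the Dockerfile and script.sh above are in the current directory):
--------------------------------------
# build the test image and run it directly, bypassing mesos entirely;
# this prints "waiting for something to do something" indefinitely
docker build -t testexecutor .
docker run --rm testexecutor
--------------------------------------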
I'm not sure exactly what the difference is, but launching a docker container
as a task WITHOUT a custom executor works fine, and I'm not sure about
launching a docker container as a task via a non-docker custom executor. The
case I'm trying for is a docker custom executor launching non-docker tasks (in
case that helps clarify the situation).
Thanks
Tyson
On Apr 17, 2015, at 1:47 PM, Jason Giedymin <[email protected]> wrote:
Try:
until <something>; do
  echo "waiting for something to do something"
  sleep 5
done
You can put this in a bash file and run that.
If you have a Dockerfile, it would be easier to debug.
-Jason
On Apr 17, 2015, at 4:24 PM, Tyson Norris <[email protected]> wrote:
Yes, agreed that the command should not exit - but the container is killed
around 0.5 seconds after launch regardless of whether the command terminates,
which is why I've been experimenting with commands that have varied exit times
(setting aside, for the moment, that the executor needs to register).
Using the command:
echo testing123c && sleep 0.1 && echo testing456c
-> I see the expected output in stdout, and the container is destroyed as
expected, because the command exits quickly and the container is then cleaned
up.
Using the command:
echo testing123d && sleep 0.6 && echo testing456d
-> I do NOT see the expected output in stdout (I only get testing123d), because
the container is destroyed prematurely after ~0.5 seconds
Using the "real" storm command, I get no output in stdout at all, probably
because no output is generated within 0.5 seconds of launch - it is a bit of a
pig to start up - so for testing purposes I'm currently just executing these
simpler commands.
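(Each test command is just swapped into the executor's CommandInfo from the snippet below, i.e. something like:

.setCommand(CommandInfo.newBuilder()
    .setShell(true)
    .setValue("echo testing123d && sleep 0.6 && echo testing456d")))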
So I'm guessing this is a timeout issue, or else the container is being reaped
inappropriately, or something else... Looking through the code, I'm trying to
figure out the steps taken during executor launch:
https://github.com/apache/mesos/blob/00318fc1b30fc0961c2dfa4d934c37866577d801/src/slave/containerizer/docker.cpp#L715
Thanks
Tyson
On Apr 17, 2015, at 12:53 PM, Jason Giedymin <[email protected]> wrote:
What is the last command you have docker doing?
If that command exits, then docker will begin to end the container.
-Jason
On Apr 17, 2015, at 3:23 PM, Tyson Norris <[email protected]> wrote:
Hi -
I am looking at revving the mesos-storm framework to be dockerized (and
simpler).
I'm using mesos 0.22.0-1.0.ubuntu1404; the mesos master and slave are deployed
in docker containers, in case it matters.
I have the storm (nimbus) framework launching fine as a docker container, but
launching tasks for a topology is having problems related to using a
docker-based executor.
For example:
TaskInfo task = TaskInfo.newBuilder()
    .setName("worker " + slot.getNodeId() + ":" + slot.getPort())
    .setTaskId(taskId)
    .setSlaveId(offer.getSlaveId())
    .setExecutor(ExecutorInfo.newBuilder()
        .setExecutorId(ExecutorID.newBuilder().setValue(details.getId()))
        .setData(ByteString.copyFromUtf8(executorDataStr))
        // run the executor itself inside a docker image
        .setContainer(ContainerInfo.newBuilder()
            .setType(ContainerInfo.Type.DOCKER)
            .setDocker(ContainerInfo.DockerInfo.newBuilder()
                .setImage("mesos-storm")))
        .setCommand(CommandInfo.newBuilder()
            .setShell(true)
            .setValue("storm supervisor storm.mesos.MesosSupervisor")))
    .build();
// rest is unchanged from existing mesos-storm framework code
The executor launches and exits quickly - see the log msg: "Executor for
container '88ce3658-7d9c-4b5f-b69a-cb5e48125dfd' has exited".
It seems like mesos loses track of the executor? I understand there is a 1
minute timeout on registering the executor, but the exit happens well before 1
minute.
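For reference, if the registration timeout were the cause I'd expect the kill to come at ~60 seconds, not ~0.5. I assume the relevant knob is the slave's --executor_registration_timeout flag - a hypothetical invocation, just to rule it out:
--------------------------------------
# hypothetical mesos-slave invocation: raise the executor registration
# timeout (default 1mins) so a slow-to-register executor isn't the issue
mesos-slave --master=zk://... \
  --containerizers=docker,mesos \
  --executor_registration_timeout=5mins
--------------------------------------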
I tried a few alternate commands to experiment, and I can see in the stdout
for the task that:
"echo testing123 && echo testing456"
prints to stdout correctly, both testing123 and testing456.
However:
"echo testing123a && sleep 10 && echo testing456a"
prints only testing123a, presumably because the container is lost and
destroyed before the sleep time is up.
So it's like the container for the executor is only allowed to run for 0.5
seconds; then it is detected as exited, and the task is lost.
Thanks for any advice.
Tyson
slave logs look like:
mesosslave_1 | I0417 19:07:27.461230  11 slave.cpp:1121] Got assigned task mesos-slave1.service.consul-31000 for framework 20150417-190611-2801799596-5050-1-0000
mesosslave_1 | I0417 19:07:27.461479  11 slave.cpp:1231] Launching task mesos-slave1.service.consul-31000 for framework 20150417-190611-2801799596-5050-1-0000
mesosslave_1 | I0417 19:07:27.463250  11 slave.cpp:4160] Launching executor insights-1-1429297638 of framework 20150417-190611-2801799596-5050-1-0000 in work directory '/tmp/mesos/slaves/20150417-190611-2801799596-5050-1-S0/frameworks/20150417-190611-2801799596-5050-1-0000/executors/insights-1-1429297638/runs/6539127f-9dbb-425b-86a8-845b748f0cd3'
mesosslave_1 | I0417 19:07:27.463444  11 slave.cpp:1378] Queuing task 'mesos-slave1.service.consul-31000' for executor insights-1-1429297638 of framework '20150417-190611-2801799596-5050-1-0000
mesosslave_1 | I0417 19:07:27.467200  7 docker.cpp:755] Starting container '6539127f-9dbb-425b-86a8-845b748f0cd3' for executor 'insights-1-1429297638' and framework '20150417-190611-2801799596-5050-1-0000'
mesosslave_1 | I0417 19:07:27.985935  7 docker.cpp:1333] Executor for container '6539127f-9dbb-425b-86a8-845b748f0cd3' has exited
mesosslave_1 | I0417 19:07:27.986359  7 docker.cpp:1159] Destroying container '6539127f-9dbb-425b-86a8-845b748f0cd3'
mesosslave_1 | I0417 19:07:27.986021  9 slave.cpp:3135] Monitoring executor 'insights-1-1429297638' of framework '20150417-190611-2801799596-5050-1-0000' in container '6539127f-9dbb-425b-86a8-845b748f0cd3'
mesosslave_1 | I0417 19:07:27.986464  7 docker.cpp:1248] Running docker stop on container '6539127f-9dbb-425b-86a8-845b748f0cd3'
mesosslave_1 | I0417 19:07:28.286761  10 slave.cpp:3186] Executor 'insights-1-1429297638' of framework 20150417-190611-2801799596-5050-1-0000 has terminated with unknown status
mesosslave_1 | I0417 19:07:28.288784  10 slave.cpp:2508] Handling status update TASK_LOST (UUID: 0795a58b-f487-42e2-aaa1-a26fe6834ed7) for task mesos-slave1.service.consul-31000 of framework 20150417-190611-2801799596-5050-1-0000 from @0.0.0.0:0
mesosslave_1 | W0417 19:07:28.289227  9 docker.cpp:841] Ignoring updating unknown container: 6539127f-9dbb-425b-86a8-845b748f0cd3
nimbus logs (framework) look like:
2015-04-17T19:07:28.302+0000 s.m.MesosNimbus [INFO] Received status update:
task_id {
  value: "mesos-slave1.service.consul-31000"
}
state: TASK_LOST
message: "Container terminated"
slave_id {
  value: "20150417-190611-2801799596-5050-1-S0"
}
timestamp: 1.429297648286981E9
source: SOURCE_SLAVE
reason: REASON_EXECUTOR_TERMINATED
11: "\a\225\245\213\364\207B\342\252\241\242o\346\203N\327"