What is the last command you have Docker running in the container?

If that command exits, Docker will begin shutting down the container.
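
To illustrate the point, here is a minimal sketch in plain Java. It is only an analogy: it uses a local `sh -c` process, not Docker or the Mesos API, and the class and method names (`CommandLifetime`, `lifetimeMillis`) are made up for this example. It shows that a command line stays alive only until its last command exits, which is the same rule that decides how long a container started from that command lives.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of Mesos, Storm, or Docker.
public class CommandLifetime {

    // Runs a command line through `sh -c` and returns how long the shell
    // process stayed alive, in milliseconds.
    static long lifetimeMillis(String shellCommand)
            throws IOException, InterruptedException {
        long start = System.nanoTime();
        Process process = new ProcessBuilder("sh", "-c", shellCommand)
                .inheritIO() // pass the command's output through to our stdout/stderr
                .start();
        process.waitFor();   // returns once the last command in the line has exited
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        // Exits almost immediately: nothing keeps the shell alive.
        long quick = lifetimeMillis("echo testing123 && echo testing456");
        // Stays alive for the duration of the sleep, then exits.
        long slow = lifetimeMillis("echo testing123a && sleep 1 && echo testing456a");
        System.out.println("quick command lived for " + quick + " ms");
        System.out.println("slow command lived for " + slow + " ms");
    }
}
```

If the real command behaves like the first case (for example, it daemonizes or fails at startup), the container would be expected to go away just as quickly.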

-Jason

> On Apr 17, 2015, at 3:23 PM, Tyson Norris <tnor...@adobe.com> wrote:
> 
> Hi -
> I am looking at revving the mesos-storm framework to be dockerized (and 
> simpler). 
> I’m using mesos 0.22.0-1.0.ubuntu1404
> mesos master + mesos slave are deployed in docker containers, in case it 
> matters. 
> 
> I have the storm (nimbus) framework launching fine as a docker container, but 
> launching tasks for a topology is having problems related to using a 
> docker-based executor.
> 
> For example:
> 
> TaskInfo task = TaskInfo.newBuilder()
>    .setName("worker " + slot.getNodeId() + ":" + slot.getPort())
>    .setTaskId(taskId)
>    .setSlaveId(offer.getSlaveId())
>    .setExecutor(ExecutorInfo.newBuilder()
>                    .setExecutorId(ExecutorID.newBuilder().setValue(details.getId()))
>                    .setData(ByteString.copyFromUtf8(executorDataStr))
>                    .setContainer(ContainerInfo.newBuilder()
>                            .setType(ContainerInfo.Type.DOCKER)
>                            .setDocker(ContainerInfo.DockerInfo.newBuilder()
>                                            .setImage("mesos-storm")))
>                    .setCommand(CommandInfo.newBuilder().setShell(true).setValue("storm supervisor storm.mesos.MesosSupervisor"))
>        //rest is unchanged from existing mesos-storm framework code
> 
> The executor launches and exits quickly - see the log msg:  Executor for 
> container '88ce3658-7d9c-4b5f-b69a-cb5e48125dfd' has exited
> 
> It seems like mesos loses track of the executor? I understand there is a 1 
> min timeout on registering the executor, but the exit happens well before 1 
> minute.
> 
> I tried a few alternate commands to experiment, and I can see in the stdout 
> for the task that
> "echo testing123 && echo testing456"
> prints to stdout correctly, both testing123 and testing456
> 
> however:
> "echo testing123a && sleep 10 && echo testing456a"
> prints only testing123a, presumably because the container is lost and 
> destroyed before the sleep time is up.
> 
> So it’s like the container for the executor is only allowed to run for about 
> 0.5 seconds, then it is detected as exited, and the task is lost. 
> 
> Thanks for any advice.
> 
> Tyson
> 
> 
> 
> slave logs look like:
> mesosslave_1  | I0417 19:07:27.461230    11 slave.cpp:1121] Got assigned task 
> mesos-slave1.service.consul-31000 for framework 
> 20150417-190611-2801799596-5050-1-0000
> mesosslave_1  | I0417 19:07:27.461479    11 slave.cpp:1231] Launching task 
> mesos-slave1.service.consul-31000 for framework 
> 20150417-190611-2801799596-5050-1-0000
> mesosslave_1  | I0417 19:07:27.463250    11 slave.cpp:4160] Launching 
> executor insights-1-1429297638 of framework 
> 20150417-190611-2801799596-5050-1-0000 in work directory 
> '/tmp/mesos/slaves/20150417-190611-2801799596-5050-1-S0/frameworks/20150417-190611-2801799596-5050-1-0000/executors/insights-1-1429297638/runs/6539127f-9dbb-425b-86a8-845b748f0cd3'
> mesosslave_1  | I0417 19:07:27.463444    11 slave.cpp:1378] Queuing task 
> 'mesos-slave1.service.consul-31000' for executor insights-1-1429297638 of 
> framework '20150417-190611-2801799596-5050-1-0000
> mesosslave_1  | I0417 19:07:27.467200     7 docker.cpp:755] Starting 
> container '6539127f-9dbb-425b-86a8-845b748f0cd3' for executor 
> 'insights-1-1429297638' and framework '20150417-190611-2801799596-5050-1-0000'
> mesosslave_1  | I0417 19:07:27.985935     7 docker.cpp:1333] Executor for 
> container '6539127f-9dbb-425b-86a8-845b748f0cd3' has exited
> mesosslave_1  | I0417 19:07:27.986359     7 docker.cpp:1159] Destroying 
> container '6539127f-9dbb-425b-86a8-845b748f0cd3'
> mesosslave_1  | I0417 19:07:27.986021     9 slave.cpp:3135] Monitoring 
> executor 'insights-1-1429297638' of framework 
> '20150417-190611-2801799596-5050-1-0000' in container 
> '6539127f-9dbb-425b-86a8-845b748f0cd3'
> mesosslave_1  | I0417 19:07:27.986464     7 docker.cpp:1248] Running docker 
> stop on container '6539127f-9dbb-425b-86a8-845b748f0cd3'
> mesosslave_1  | I0417 19:07:28.286761    10 slave.cpp:3186] Executor 
> 'insights-1-1429297638' of framework 20150417-190611-2801799596-5050-1-0000 
> has terminated with unknown status
> mesosslave_1  | I0417 19:07:28.288784    10 slave.cpp:2508] Handling status 
> update TASK_LOST (UUID: 0795a58b-f487-42e2-aaa1-a26fe6834ed7) for task 
> mesos-slave1.service.consul-31000 of framework 
> 20150417-190611-2801799596-5050-1-0000 from @0.0.0.0:0
> mesosslave_1  | W0417 19:07:28.289227     9 docker.cpp:841] Ignoring updating 
> unknown container: 6539127f-9dbb-425b-86a8-845b748f0cd3
> 
> nimbus logs (framework) look like:
> 2015-04-17T19:07:28.302+0000 s.m.MesosNimbus [INFO] Received status update: 
> task_id {
>  value: "mesos-slave1.service.consul-31000"
> }
> state: TASK_LOST
> message: "Container terminated"
> slave_id {
>  value: "20150417-190611-2801799596-5050-1-S0"
> }
> timestamp: 1.429297648286981E9
> source: SOURCE_SLAVE
> reason: REASON_EXECUTOR_TERMINATED
> 11: "\a\225\245\213\364\207B\342\252\241\242o\346\203N\327"
> 
> 
> 
