Ah, colons in the executorId. What version of Mesos are you running? You
might be hitting https://issues.apache.org/jira/browse/MESOS-1833
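
To illustrate why the colons matter, here is a minimal Python sketch. It assumes that `docker run -v` expects HOST:CONTAINER[:MODE] and rejects any value with more than two colons. That is a simplified model, not Docker's actual code, and the function names are hypothetical. The sandbox path is the one from the log quoted below.

```python
def validate_volume_spec(spec):
    """Simplified model of a HOST:CONTAINER[:MODE] check.

    Assumption: any value with more than two colons is rejected.
    This is an illustration, not Docker's real implementation.
    """
    if spec.count(":") > 2:
        raise ValueError("bad format for volumes: " + spec)
    return spec


def find_colon_components(path):
    """Return the path components that contain a colon."""
    return [part for part in path.split("/") if ":" in part]


# Host sandbox directory taken from the slave log; the executorId
# 'ct:1424725680000:0:dockerjob:' is embedded in the path.
sandbox = (
    "/tmp/mesos/slaves/20150220-234013-1326718892-5050-1615-2"
    "/frameworks/20150222-090257-1326718892-5050-9995-0001"
    "/executors/ct:1424725680000:0:dockerjob:"
    "/runs/1f5295b2-9694-40ec-b900-f17de71d3bf4"
)
spec = sandbox + ":/mnt/mesos/sandbox"

# Only the executorId component carries colons.
print(find_colon_components(sandbox))

try:
    validate_volume_spec(spec)
except ValueError as err:
    print("rejected:", err)
```

Under that assumption, the four colons in the executorId plus the host/container separator push the value over the limit, so the mount is rejected before the container is ever created.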

On Tue, Feb 24, 2015 at 9:39 AM, max square <[email protected]> wrote:

> Adam,
>
> Thanks for the pointer. I was able to pull the logs for the docker run
> command. As far as I understand, it is actually pulling the image, but it
> is having trouble starting the container. I highlighted in red what I
> think is the main reason for the error: a bad format for the volume that
> Mesos wants to mount into the container (see logs below).
> I also notice that after the error it can't find the container. Is that
> normal?
>
> Do you have any thoughts? Any help is greatly appreciated!
>
> #It all starts well
> I0223 21:08:00.884003  1618 docker.cpp:743] Starting container
> '1f5295b2-9694-40ec-b900-f17de71d3bf4' for task
> 'ct:1424725680000:0:dockerjob:' (and executor
> 'ct:1424725680000:0:dockerjob:') of framework
> '20150222-090257-1326718892-5050-9995-0001'
>
> #Can't use the format for the volume
> E0223 21:09:40.772276  1625 slave.cpp:2485] Container
> '1f5295b2-9694-40ec-b900-f17de71d3bf4' for executor
> 'ct:1424725680000:0:dockerjob:' of framework
> '20150222-090257-1326718892-5050-9995-0001' failed to start: Failed to
> 'docker run -d -c 512 -m 536870912 -e
> mesos_task_id=ct:1424725680000:0:dockerjob: -e CHRONOS_JOB_OWNER= -e
> CHRONOS_JOB_NAME=dockerjob -e HOST=ec2-52-1-xx-xx.compute-1.amazonaws.com
> -e CHRONOS_RESOURCE_MEM=512.0 -e CHRONOS_RESOURCE_CPU=0.5 -e
> CHRONOS_RESOURCE_DISK=256.0 -e MESOS_SANDBOX=/mnt/mesos/sandbox -v
> /tmp/mesos/slaves/20150220-234013-1326718892-5050-1615-2/frameworks/20150222-090257-1326718892-5050-9995-0001/executors/ct:1424725680000:0:dockerjob:/runs/1f5295b2-9694-40ec-b900-f17de71d3bf4:/mnt/mesos/sandbox
> --net host --entrypoint /bin/sh --name
> mesos-1f5295b2-9694-40ec-b900-f17de71d3bf4 libmesos/ubuntu -c while sleep
> 10; do date -u +%T; done': exit status = exited with status 2 stderr =
> invalid value
> "/tmp/mesos/slaves/20150220-234013-1326718892-5050-1615-2/frameworks/20150222-090257-1326718892-5050-9995-0001/executors/ct:1424725680000:0:dockerjob:/runs/1f5295b2-9694-40ec-b900-f17de71d3bf4:/mnt/mesos/sandbox"
> for flag -v: bad format for volumes:
> /tmp/mesos/slaves/20150220-234013-1326718892-5050-1615-2/frameworks/20150222-090257-1326718892-5050-9995-0001/executors/ct:1424725680000:0:dockerjob:/runs/1f5295b2-9694-40ec-b900-f17de71d3bf4:/mnt/mesos/sandbox
>
> Usage: docker run [OPTIONS] IMAGE [COMMAND] [ARG...]
>                                'bridge': creates a new network stack for
> the container on the docker bridge
>                                (use 'docker port' to see the actual
> mapping)
>
> # Fails and can't destroy the container, is this normal?
> E0223 21:09:40.772655  1625 slave.cpp:2580] Termination of executor
> 'ct:1424725680000:0:dockerjob:' of framework
> '20150222-090257-1326718892-5050-9995-0001' failed: No container found
>
> Thanks in advance
>
> Sergio Daniel
>
> On Tue, Feb 24, 2015 at 1:51 AM, Adam Bordelon <[email protected]> wrote:
>
>> Check the mesos-slave log on one of the slaves, in
>> /var/log/mesos/mesos-slave.INFO. There's probably some information there
>> about the docker pull, or other things that could have errored before the
>> actual container is launched.
>> Alternatively, you could try a `docker pull` manually on one of the
>> slaves, then see if the launch succeeds on that node. Then you'll know if
>> it was a timeout during the docker pull, at which point you can either
>> further increase the registration timeout or decide to pre-pull all your
>> images (as a periodic Chronos task?), due to unpredictable network
>> latencies in AWS.
>>
>> On Mon, Feb 23, 2015 at 4:31 PM, max square <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> I am using the cloudformation scripts
>>> <https://github.com/mbabineau/cloudformation-mesos> to create a Mesos
>>> cluster with Marathon 0.7.5 and Chronos 2.3.2. The setup is working
>>> perfectly for regular processes. However, now I am trying to deploy a
>>> simple docker image, but it is failing without producing any errors in the
>>> sandbox.
>>>
>>> I followed this tutorial
>>> <https://mesosphere.com/docs/tutorials/launch-docker-container-on-mesosphere/>
>>> to set the Mesos executor timeout to 5 minutes, and I can see the following
>>> processes running on all the slave machines, with the containerizers in
>>> the correct order:
>>>
>>> root      1615  0.0  0.0    168     4 ?        Ss   Feb20   0:00 runsv mesos-slave
>>> root      1616  0.0  0.0    184     4 ?        S    Feb20   0:00 svlogd -tt /var/log/mesos-slave
>>> root      1617  3.1  0.2 874688 17376 ?        Sl   Feb20 133:56 /usr/local/sbin/mesos-slave --log_dir=/var/log/mesos --containerizers=docker,mesos
>>> root      3290  0.0  0.0   4444   652 ?        Ss   Feb21   0:00 sh -c /usr/local/libexec/mesos/mesos-executor
>>> root      3304  0.0  0.1 720528 10332 ?        Sl   Feb21   1:01 /usr/local/libexec/mesos mesos-executor
>>>
>>> Does anyone have suggestions on how to debug the issue?
>>>
>>> Thanks in advance
>>>
>>> Sergio Daniel
>>>
>>
>>
>
