Check the mesos-slave log on one of the slaves, in
/var/log/mesos/mesos-slave.INFO. It probably has some information about
the docker pull, or about anything else that failed before the container
was actually launched.
Alternatively, you could run a `docker pull` manually on one of the slaves,
then see if the launch succeeds on that node. If it does, the failure was
most likely a timeout during the docker pull, and you can either increase
the registration timeout further or pre-pull all your images (as a periodic
Chronos task?) to guard against unpredictable network latencies in AWS.
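
A minimal sketch of that manual check, in case it helps: it times a
`docker pull` and compares the result against the 5-minute timeout you
mentioned. The helper names (`time_command`, `pull_fits_timeout`, `report`)
are made up for illustration, and it assumes the `docker` CLI is on the PATH
of the slave and that the image is not already cached there (otherwise the
timing tells you nothing).

```python
import subprocess
import sys
import time

# Assumption: the 5-minute executor timeout described in the original post.
REGISTRATION_TIMEOUT_SECS = 5 * 60


def time_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(cmd)
    return result.returncode, time.monotonic() - start


def pull_fits_timeout(elapsed_secs, timeout_secs=REGISTRATION_TIMEOUT_SECS):
    """Return True if the pull finished within the registration timeout."""
    return elapsed_secs < timeout_secs


def report(image):
    """Pull a (hypothetical) image name and print how long the pull took."""
    code, elapsed = time_command(["docker", "pull", image])
    if code != 0:
        print(f"docker pull failed with exit code {code}", file=sys.stderr)
        return
    verdict = "within" if pull_fits_timeout(elapsed) else "over"
    print(f"pull took {elapsed:.1f}s, {verdict} the "
          f"{REGISTRATION_TIMEOUT_SECS}s timeout")
```

Calling `report(...)` with your image name on a slave that has never pulled
it should show whether the pull alone is eating most of the timeout.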

On Mon, Feb 23, 2015 at 4:31 PM, max square <[email protected]> wrote:

> Hi all,
>
> I am using the cloudformation scripts
> <https://github.com/mbabineau/cloudformation-mesos> to create a Mesos
> cluster, with Marathon 0.7.5 and Chronos 2.3.2. The setup is working
> perfectly for regular processes. However, I am now trying to deploy a
> simple docker image, and it is failing without producing any errors in the
> sandbox.
>
> I followed this tutorial
> <https://mesosphere.com/docs/tutorials/launch-docker-container-on-mesosphere/>
> to set the Mesos executor timeout to 5mins, and I can see the following
> processes running on all the slave machines, with the containerizers in
> the correct order:
>
> root      1615  0.0  0.0    168     4 ?        Ss   Feb20   0:00 runsv mesos-slave
> root      1616  0.0  0.0    184     4 ?        S    Feb20   0:00 svlogd -tt /var/log/mesos-slave
> root      1617  3.1  0.2 874688 17376 ?        Sl   Feb20 133:56 /usr/local/sbin/mesos-slave --log_dir=/var/log/mesos --containerizers=docker,mesos
> root      3290  0.0  0.0   4444   652 ?        Ss   Feb21   0:00 sh -c /usr/local/libexec/mesos/mesos-executor
> root      3304  0.0  0.1 720528 10332 ?        Sl   Feb21   1:01 /usr/local/libexec/mesos mesos-executor
>
> Does anyone have suggestions on how to debug the issue?
>
> Thanks in advance,
>
> Sergio Daniel
>
