I think it is fixed now.

Because I was upgrading from Deimos to native Docker support, our cluster had
already launched **and terminated** thousands of containers through
Jenkins -> Marathon -> Mesos and Deimos.

Now, with the native Docker support in Mesos 0.20.0, the slave was trying to
recover all of those terminated containers.

Fix: list all containers and use docker rm to remove them.

docker ps -a -q | xargs docker rm
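
A slightly more careful variant, as a rough sketch only: the helper below
(exited_ids is a name made up for this example) reads `docker ps -a` output on
stdin and prints only the IDs of containers whose status reads "Exited (...)",
so the header row and any running containers are skipped. It assumes the
default table layout of docker ps, with the container ID in the first column.

```shell
# Hypothetical helper: print the IDs of exited containers.
# Reads the table printed by `docker ps -a` from stdin.
exited_ids() {
  # NR > 1 skips the header row; "Exited (" marks a stopped container.
  awk 'NR > 1 && /Exited \(/ { print $1 }'
}
```

Usage would then look like `docker ps -a | exited_ids | xargs -r docker rm`,
where -r is a GNU xargs option that skips the run when there is no input.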

2014-08-28 12:48 GMT+02:00 Javier Ruiz Jiménez <[email protected]>:

> Thanks Jie
>
> Yes, I think that is the reason, but why does the slave get stuck? What is
> done during the process of recovering Docker containers?
>
> The log line "slave.cpp:3195] Finished recovery" never appears when
> containerizers=docker
>
> With containerizers=docker, the log goes straight from "Recovering
> containerizer" to "Current usage":
>
> I0828 12:29:19.971142  2210 slave.cpp:307] Slave checkpoint: true
> I0828 12:29:19.975369  2208 state.cpp:33] Recovering state from
> '/tmp/mesos/meta'
> I0828 12:29:19.975666  2208 state.cpp:62] Failed to find the latest slave
> from '/tmp/mesos/meta'
> I0828 12:29:19.979894  2205 status_update_manager.cpp:193] Recovering
> status update manager
> I0828 12:29:19.980206  2209 docker.cpp:649] Recovering Docker containers
> I0828 12:29:19.980765  2209 containerizer.cpp:252] Recovering containerizer
> I0828 12:30:19.972507  2206 slave.cpp:3050] Current usage 44.11%. Max
> allowed age: 3.212379842874491days
>
> Javier
>
>
> 2014-08-27 23:30 GMT+02:00 Jie Yu <[email protected]>:
>
>> I0827 22:58:18.789118  3383 docker.cpp:649] Recovering Docker containers
>>
>> It seems that the slave is stuck recovering Docker containers.
>>
>> - Jie
>>
>>
>> On Wed, Aug 27, 2014 at 2:00 PM, Javier Ruiz Jiménez <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> My first post in the user list. Writing from Spain.
>>>
>>> I am setting up a Mesos cluster and, after upgrading to 0.20.0 and
>>> reconfiguring to use the native Docker functionality (removing Deimos), I
>>> am not able to use containerizers=docker
>>>
>>> Releases:
>>> - Mesos-slave: Version: 0.20.0 Build: 2014-08-22 05:05:59 from
>>> repository http://repos.mesosphere.io/
>>> - Mesos-master: Version: 0.20.0 Build: 2014-08-22 05:05:59  from
>>> repository http://repos.mesosphere.io/
>>> - O.S.:  Ubuntu 14.04.1 LTS
>>> - Marathon: Marathon 0.7.0-SNAPSHOT
>>> - Docker version 1.2.0, build fa7b24f
>>>
>>> Installation:
>>> - Single host running mesos-master, mesos-slave, zookeeper and marathon
>>>
>>> If I run mesos-slave with --containerizers=docker,mesos , the mesos UI
>>> doesn't show any slaves or resources. No detection of master.
>>>
>>> If I run mesos-slave with --containerizers=mesos , the mesos UI shows
>>> the slaves and the resources.
>>>
>>> Where can I look for more information?
>>>
>>> I plan to build from repository to see if with the latest code it works.
>>>
>>> I don't see any relevant info in the log:
>>>
>>> Log for --containerizers=docker,mesos:
>>> Command: mesos-slave --master=192.168.0.57:5050
>>> --containerizers=mesos,docker  --hostname=192.168.0.57 --IP=192.168.0.57
>>> --log_dir=/var/log/mesos
>>> ________________________________
>>> I0827 22:58:17.774178  3379 logging.cpp:142] INFO level logging started!
>>> I0827 22:58:17.775691  3379 main.cpp:126] Build: 2014-08-22 05:05:59 by
>>> root
>>> I0827 22:58:17.775956  3379 main.cpp:128] Version: 0.20.0
>>> I0827 22:58:17.776199  3379 main.cpp:131] Git tag: 0.20.0
>>> I0827 22:58:17.776407  3379 main.cpp:135] Git SHA:
>>> f421ffdf8d32a8834b3a6ee483b5b59f65956497
>>> I0827 22:58:17.776654  3379 containerizer.cpp:89] Using isolation:
>>> posix/cpu,posix/mem
>>> I0827 22:58:18.783756  3379 main.cpp:149] Starting Mesos slave
>>> I0827 22:58:18.784618  3387 slave.cpp:167] Slave started on 1)@
>>> 192.168.0.57:5051
>>> I0827 22:58:18.784915  3387 slave.cpp:278] Slave resources: cpus(*):2;
>>> mem(*):15025; disk(*):26760; ports(*):[31000-32000]
>>> I0827 22:58:18.785161  3387 slave.cpp:306] Slave hostname: 192.168.0.57
>>> I0827 22:58:18.785348  3387 slave.cpp:307] Slave checkpoint: true
>>> I0827 22:58:18.787921  3381 state.cpp:33] Recovering state from
>>> '/tmp/mesos/meta'
>>> I0827 22:58:18.788781  3381 status_update_manager.cpp:193] Recovering
>>> status update manager
>>> I0827 22:58:18.789072  3381 containerizer.cpp:252] Recovering
>>> containerizer
>>> I0827 22:58:18.789118  3383 docker.cpp:649] Recovering Docker containers
>>>
>>> Log for --containerizers=mesos:
>>> Command: mesos-slave --master=192.168.0.57:5050 --containerizers=mesos
>>>  --hostname=192.168.0.57 --IP=192.168.0.57 --log_dir=/var/log/mesos
>>> __________________________
>>> I0827 22:53:09.960764  3369 logging.cpp:142] INFO level logging started!
>>> I0827 22:53:09.962329  3369 main.cpp:126] Build: 2014-08-22 05:05:59 by
>>> root
>>> I0827 22:53:09.962749  3369 main.cpp:128] Version: 0.20.0
>>> I0827 22:53:09.963065  3369 main.cpp:131] Git tag: 0.20.0
>>> I0827 22:53:09.963369  3369 main.cpp:135] Git SHA:
>>> f421ffdf8d32a8834b3a6ee483b5b59f65956497
>>> I0827 22:53:09.963709  3369 containerizer.cpp:89] Using isolation:
>>> posix/cpu,posix/mem
>>> I0827 22:53:09.968011  3369 main.cpp:149] Starting Mesos slave
>>> I0827 22:53:09.968703  3377 slave.cpp:167] Slave started on 1)@
>>> 192.168.0.57:5051
>>> I0827 22:53:09.969319  3377 slave.cpp:278] Slave resources: cpus(*):2;
>>> mem(*):15025; disk(*):26760; ports(*):[31000-32000]
>>> I0827 22:53:09.969703  3377 slave.cpp:306] Slave hostname: 192.168.0.57
>>> I0827 22:53:09.969912  3377 slave.cpp:307] Slave checkpoint: true
>>> I0827 22:53:09.972486  3377 state.cpp:33] Recovering state from
>>> '/tmp/mesos/meta'
>>> I0827 22:53:09.972882  3377 state.cpp:62] Failed to find the latest
>>> slave from '/tmp/mesos/meta'
>>> I0827 22:53:09.973140  3377 status_update_manager.cpp:193] Recovering
>>> status update manager
>>> I0827 22:53:09.973368  3377 containerizer.cpp:252] Recovering
>>> containerizer
>>> I0827 22:53:09.973819  3377 slave.cpp:3195] Finished recovery
>>> I0827 22:53:09.974473  3377 slave.cpp:589] New master detected at
>>> [email protected]:5050
>>> I0827 22:53:09.974822  3370 status_update_manager.cpp:167] New master
>>> detected at [email protected]:5050
>>> I0827 22:53:09.974983  3377 slave.cpp:625] No credentials provided.
>>> Attempting to register without authentication
>>> I0827 22:53:09.975147  3377 slave.cpp:636] Detecting new master
>>> I0827 22:53:10.758889  3374 slave.cpp:754] Registered with master
>>> [email protected]:5050; given slave ID
>>> 20140827-224307-956344512-5050-3182-0
>>> I0827 22:53:10.759145  3374 slave.cpp:767] Checkpointing SlaveInfo to
>>> '/tmp/mesos/meta/slaves/20140827-224307-956344512-5050-3182-0/slave.info
>>> '
>>>
>>> Thanks.
>>> --
>>> Javier
>>>
>>
>>
>
>
> --
> Javier Ruiz Jiménez
> Tecsisa
> E: [email protected]
> Tel: +34 91.182.04.71 (directo)
> Tel: +34 91.445.21.15 (centralita)
> Fax: +34 91.447.05.11
> http://www.tecsisa.com
>



-- 
Javier Ruiz Jiménez
Tecsisa
E: [email protected]
Tel: +34 91.182.04.71 (directo)
Tel: +34 91.445.21.15 (centralita)
Fax: +34 91.447.05.11
http://www.tecsisa.com
