Hi
I am running Mesos 0.22.1 on a setup with two Docker slaves: one runs kernel
3.13.0 with Docker 1.6.2, and the other runs kernel 3.14.5 with Docker 1.7.1. I
am able to run Marathon tasks on the first one, e.g.,
{
  "id": "hello-sleep",
  "cmd": "while [ true ] ; do echo 'Hello Marathon' ; sleep 5 ; done",
  "cpus": 0.1,
  "mem": 10.0,
  "instances": 1
}
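For reference, a minimal sketch of how such a definition could be checked and submitted. It assumes Marathon's REST endpoint POST /v2/apps; the host name marathon.example and port 8080 are placeholders, not my real setup, and the id is written without spaces. Parsing with a strict JSON parser first catches malformed input (such as a stray trailing comma) before anything is sent.

```python
import json
import urllib.request

# Hypothetical Marathon endpoint; replace with the real host and port.
MARATHON_URL = "http://marathon.example:8080/v2/apps"

APP_DEFINITION = """
{
  "id": "hello-sleep",
  "cmd": "while [ true ] ; do echo 'Hello Marathon' ; sleep 5 ; done",
  "cpus": 0.1,
  "mem": 10.0,
  "instances": 1
}
"""


def build_request(definition, url=MARATHON_URL):
    """Parse the app definition strictly and wrap it in a POST request."""
    # json.loads raises json.JSONDecodeError if the text is not valid JSON.
    app = json.loads(definition)
    body = json.dumps(app).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_request(APP_DEFINITION)
    print(request.get_method(), request.full_url)
    # Uncomment to actually submit the app to Marathon:
    # urllib.request.urlopen(request)
```

The request is only built here, not sent, so the sketch can be run without a Marathon instance.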
However, running the same task on the second slave leaves the deployment
waiting indefinitely. The only log output I can gather is the following, from
the second slave:
Log file created at: 2015/08/07 17:49:05
Running on machine: instance-000003e2
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0807 17:49:05.882333 565 logging.cpp:172] INFO level logging started!
I0807 17:49:05.883029 565 main.cpp:156] Build: 2015-05-05 06:15:50 by root
I0807 17:49:05.883050 565 main.cpp:158] Version: 0.22.1
I0807 17:49:05.883061 565 main.cpp:161] Git tag: 0.22.1
I0807 17:49:05.883071 565 main.cpp:165] Git SHA:
d6309f92a7f9af3ab61a878403e3d9c284ea87e0
I0807 17:49:05.883919 565 containerizer.cpp:110] Using isolation:
posix/cpu,posix/mem
I0807 17:49:05.884652 565 main.cpp:200] Starting Mesos slave
I0807 17:49:05.886327 576 slave.cpp:174] Slave started on 1)@10.40.50.117:5051
I0807 17:49:05.887329 576 slave.cpp:322] Slave resources: cpus(*):8;
mem(*):14928; disk(*):4975; ports(*):[31000-32000]
I0807 17:49:05.887958 576 slave.cpp:351] Slave hostname: 10.40.50.117
I0807 17:49:05.887979 576 slave.cpp:352] Slave checkpoint: true
I0807 17:49:05.891291 571 state.cpp:35] Recovering state from
'/tmp/mesos/meta'
I0807 17:49:05.891484 577 status_update_manager.cpp:197] Recovering status
update manager
I0807 17:49:05.891784 570 containerizer.cpp:307] Recovering containerizer
I0807 17:49:05.892216 577 slave.cpp:3808] Finished recovery
I0807 17:49:05.915279 574 group.cpp:313] Group process
(group(1)@10.40.50.117:5051) connected to ZooKeeper
I0807 17:49:05.915360 574 group.cpp:790] Syncing group operations: queue size
(joins, cancels, datas) = (0, 0, 0)
I0807 17:49:05.915385 574 group.cpp:385] Trying to create path '/mesos' in
ZooKeeper
I0807 17:49:05.919221 574 detector.cpp:138] Detected a new leader: (id='102')
I0807 17:49:05.919374 571 group.cpp:659] Trying to get
'/mesos/info_0000000102' in ZooKeeper
I0807 17:49:05.921257 571 detector.cpp:452] A new leading master
([email protected]:5050) is detected
I0807 17:49:05.921408 571 slave.cpp:647] New master detected at
[email protected]:5050
I0807 17:49:05.921423 573 status_update_manager.cpp:171] Pausing sending
status updates
I0807 17:49:05.921733 571 slave.cpp:672] No credentials provided. Attempting
to register without authentication
I0807 17:49:05.922029 571 slave.cpp:683] Detecting new master
I0807 17:49:06.870721 571 slave.cpp:815] Registered with master
[email protected]:5050; given slave ID 20150807-174737-1982998538-5050-1871-S1
I0807 17:49:06.870865 573 status_update_manager.cpp:178] Resuming sending
status updates
I0807 17:50:05.901607 577 slave.cpp:3648] Current disk usage 67.68%. Max
allowed age: 1.562342327913970days
...
I0807 17:55:05.965945 575 slave.cpp:3648] Current disk usage 67.68%. Max
allowed age: 1.562339580158692days
I0807 17:55:37.012629 575 http.cpp:331] HTTP request for
'/slave(1)/state.json'
I0807 17:56:05.966291 575 slave.cpp:3648] Current disk usage 67.68%. Max
allowed age: 1.562339580158692days
I0807 17:56:11.760229 577 slave.cpp:1144] Got assigned task
hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 for framework
20150624-232916-16777343-5050-1628-0000
I0807 17:56:11.762622 577 slave.cpp:1254] Launching task
hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 for framework
20150624-232916-16777343-5050-1628-0000
I0807 17:56:11.772568 577 slave.cpp:4208] Launching executor
hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000 in work directory
'/tmp/mesos/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000/executors/hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0/runs/260e0c23-835a-48a8-ab40-c9566077373f'
I0807 17:56:11.773077 574 containerizer.cpp:484] Starting container
'260e0c23-835a-48a8-ab40-c9566077373f' for executor
'hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0' of framework
'20150624-232916-16777343-5050-1628-0000'
I0807 17:56:11.773815 577 slave.cpp:1401] Queuing task
'hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0' for executor
hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework
'20150624-232916-16777343-5050-1628-0000
I0807 17:56:11.776347 574 launcher.cpp:130] Forked child with pid '599' for
container '260e0c23-835a-48a8-ab40-c9566077373f'
I0807 17:56:11.776995 574 containerizer.cpp:694] Checkpointing executor's
forked pid 599 to
'/tmp/mesos/meta/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000/executors/hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0/runs/260e0c23-835a-48a8-ab40-c9566077373f/pids/forked.pid'
I0807 17:56:11.778745 577 slave.cpp:3165] Monitoring executor
'hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0' of framework
'20150624-232916-16777343-5050-1628-0000' in container
'260e0c23-835a-48a8-ab40-c9566077373f'
I0807 17:56:17.071663 570 http.cpp:331] HTTP request for
'/slave(1)/state.json'
I0807 17:56:49.128911 570 http.cpp:331] HTTP request for
'/slave(1)/state.json'
I0807 17:57:05.966832 577 slave.cpp:3648] Current disk usage 67.68%. Max
allowed age: 1.562268138521401days
I0807 17:58:05.967597 570 slave.cpp:3648] Current disk usage 67.68%. Max
allowed age: 1.562268138521401days
I0807 17:58:24.217295 575 containerizer.cpp:1123] Executor for container
'260e0c23-835a-48a8-ab40-c9566077373f' has exited
I0807 17:58:24.217479 575 containerizer.cpp:918] Destroying container
'260e0c23-835a-48a8-ab40-c9566077373f'
I0807 17:58:24.220337 573 slave.cpp:3223] Executor
'hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0' of framework
20150624-232916-16777343-5050-1628-0000 terminated with signal Killed
I0807 17:58:24.222532 573 slave.cpp:2531] Handling status update TASK_FAILED
(UUID: ae31ede1-5346-4dda-ba68-9a1053231c68) for task
hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000 from @0.0.0.0:0
W0807 17:58:24.222796 570 containerizer.cpp:814] Ignoring update for unknown
container: 260e0c23-835a-48a8-ab40-c9566077373f
I0807 17:58:24.223337 573 status_update_manager.cpp:317] Received status
update TASK_FAILED (UUID: ae31ede1-5346-4dda-ba68-9a1053231c68) for task
hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000
I0807 17:58:24.223754 573 status_update_manager.hpp:346] Checkpointing UPDATE
for status update TASK_FAILED (UUID: ae31ede1-5346-4dda-ba68-9a1053231c68) for
task hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000
I0807 17:58:24.234887 577 slave.cpp:2776] Forwarding the update TASK_FAILED
(UUID: ae31ede1-5346-4dda-ba68-9a1053231c68) for task
hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000 to [email protected]:5050
I0807 17:58:24.253701 572 status_update_manager.cpp:389] Received status
update acknowledgement (UUID: ae31ede1-5346-4dda-ba68-9a1053231c68) for task
hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000
I0807 17:58:24.253852 572 status_update_manager.hpp:346] Checkpointing ACK
for status update TASK_FAILED (UUID: ae31ede1-5346-4dda-ba68-9a1053231c68) for
task hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000
I0807 17:58:24.264957 570 slave.cpp:3332] Cleaning up executor
'hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0' of framework
20150624-232916-16777343-5050-1628-0000
I0807 17:58:24.265409 572 gc.cpp:56] Scheduling
'/tmp/mesos/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000/executors/hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0/runs/260e0c23-835a-48a8-ab40-c9566077373f'
for gc 6.99999692936593days in the future
I0807 17:58:24.265508 570 slave.cpp:3411] Cleaning up framework
20150624-232916-16777343-5050-1628-0000
I0807 17:58:24.265611 572 gc.cpp:56] Scheduling
'/tmp/mesos/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000/executors/hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0'
for gc 6.99999692844741days in the future
I0807 17:58:24.265702 572 gc.cpp:56] Scheduling
'/tmp/mesos/meta/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000/executors/hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0/runs/260e0c23-835a-48a8-ab40-c9566077373f'
for gc 6.99999692769778days in the future
I0807 17:58:24.265718 573 status_update_manager.cpp:279] Closing status
update streams for framework 20150624-232916-16777343-5050-1628-0000
I0807 17:58:24.265791 572 gc.cpp:56] Scheduling
'/tmp/mesos/meta/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000/executors/hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0'
for gc 6.99999692723556days in the future
I0807 17:58:24.265909 572 gc.cpp:56] Scheduling
'/tmp/mesos/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000'
for gc 6.99999692436444days in the future
I0807 17:58:24.265960 572 gc.cpp:56] Scheduling
'/tmp/mesos/meta/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000'
for gc 6.99999692373926days in the future
I0807 17:58:28.115653 577 http.cpp:331] HTTP request for
'/slave(1)/state.json'
I0807 17:58:30.785490 570 slave.cpp:1144] Got assigned task
hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 for framework
20150624-232916-16777343-5050-1628-0000
I0807 17:58:30.786942 570 gc.cpp:84] Unscheduling
'/tmp/mesos/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000'
from gc
I0807 17:58:30.787180 570 gc.cpp:84] Unscheduling
'/tmp/mesos/meta/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000'
from gc
I0807 17:58:30.787348 570 slave.cpp:1254] Launching task
hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 for framework
20150624-232916-16777343-5050-1628-0000
I0807 17:58:30.795363 570 slave.cpp:4208] Launching executor
hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000 in work directory
'/tmp/mesos/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000/executors/hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0/runs/85b55012-196d-47e7-b3ce-18695be37fe5'
I0807 17:58:30.795881 575 containerizer.cpp:484] Starting container
'85b55012-196d-47e7-b3ce-18695be37fe5' for executor
'hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0' of framework
'20150624-232916-16777343-5050-1628-0000'
I0807 17:58:30.796345 570 slave.cpp:1401] Queuing task
'hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0' for executor
hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 of framework
'20150624-232916-16777343-5050-1628-0000
I0807 17:58:30.797998 575 launcher.cpp:130] Forked child with pid '613' for
container '85b55012-196d-47e7-b3ce-18695be37fe5'
I0807 17:58:30.798468 575 containerizer.cpp:694] Checkpointing executor's
forked pid 613 to
'/tmp/mesos/meta/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-0000/executors/hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0/runs/85b55012-196d-47e7-b3ce-18695be37fe5/pids/forked.pid'
I0807 17:58:30.799487 573 slave.cpp:3165] Monitoring executor
'hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0' of framework
'20150624-232916-16777343-5050-1628-0000' in container
'85b55012-196d-47e7-b3ce-18695be37fe5'
I0807 17:58:39.151962 577 http.cpp:331] HTTP request for
'/slave(1)/state.json'
I0807 17:59:05.968586 572 slave.cpp:3648] Current disk usage 67.68%. Max
allowed age: 1.562210435660521days
I0807 18:00:05.192230 570 slave.cpp:1581] Asked to kill task
hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000
I0807 18:00:05.194283 570 slave.cpp:2531] Handling status update TASK_KILLED
(UUID: b4c4296c-a64c-4a38-8d4a-b39211f71352) for task
hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000 from @0.0.0.0:0
W0807 18:00:05.194577 570 slave.cpp:1694] Killing the unregistered executor
'hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0' of framework
20150624-232916-16777343-5050-1628-0000 because it has no tasks
I0807 18:00:05.194859 577 containerizer.cpp:918] Destroying container
'85b55012-196d-47e7-b3ce-18695be37fe5'
I0807 18:00:05.195132 576 status_update_manager.cpp:317] Received status
update TASK_KILLED (UUID: b4c4296c-a64c-4a38-8d4a-b39211f71352) for task
hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000
I0807 18:00:05.195515 576 status_update_manager.hpp:346] Checkpointing UPDATE
for status update TASK_KILLED (UUID: b4c4296c-a64c-4a38-8d4a-b39211f71352) for
task hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000
I0807 18:00:05.203191 577 slave.cpp:2776] Forwarding the update TASK_KILLED
(UUID: b4c4296c-a64c-4a38-8d4a-b39211f71352) for task
hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000 to [email protected]:5050
I0807 18:00:05.216285 576 status_update_manager.cpp:389] Received status
update acknowledgement (UUID: b4c4296c-a64c-4a38-8d4a-b39211f71352) for task
hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000
I0807 18:00:05.216399 576 status_update_manager.hpp:346] Checkpointing ACK
for status update TASK_KILLED (UUID: b4c4296c-a64c-4a38-8d4a-b39211f71352) for
task hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0 of framework
20150624-232916-16777343-5050-1628-0000
I0807 18:00:05.292657 573 containerizer.cpp:1123] Executor for container
'85b55012-196d-47e7-b3ce-18695be37fe5' has exited
I0807 18:00:05.293400 577 slave.cpp:3223] Executor
'hello-gpu-sleep.edd91e11-3d2d-11e5-935a-fe1614f46ae0' of framework
20150624-232916-16777343-5050-1628-0000 terminated with signal Killed
...
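To make the timeline in the log above easier to read, here is a rough sketch that parses lines in the format given in the log header ([IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg), rejoins the wrapped continuation lines, and measures how long a container lived between "Forked child" and "has exited". The function names and the choice of those two marker messages are my own, and the year is assumed to be 2015 because the log prefix does not carry one.

```python
import re
from datetime import datetime, timedelta

# glog prefix: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
PREFIX = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+) ([\w.]+:\d+)\] (.*)$"
)


def parse(lines, year=2015):
    """Return (level, timestamp, source, message) tuples, rejoining wrapped lines."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line or line == "...":
            continue  # skip blanks and elision markers
        match = PREFIX.match(line)
        if match:
            level, mmdd, clock, _thread, source, msg = match.groups()
            stamp = datetime.strptime(f"{year}{mmdd} {clock}", "%Y%m%d %H:%M:%S.%f")
            records.append([level, stamp, source, msg])
        elif records:
            # A line without the prefix continues the previous message.
            records[-1][3] += " " + line
    return [tuple(record) for record in records]


def container_lifetime(records, container_id):
    """Time between the fork of a container's executor and its exit."""
    start = next(ts for _, ts, _, msg in records
                 if "Forked child" in msg and container_id in msg)
    end = next(ts for _, ts, _, msg in records
               if "has exited" in msg and container_id in msg)
    return end - start


if __name__ == "__main__":
    sample = [
        "I0807 17:56:11.776347 574 launcher.cpp:130] Forked child with pid '599' for",
        "container '260e0c23-835a-48a8-ab40-c9566077373f'",
        "I0807 17:58:24.217295 575 containerizer.cpp:1123] Executor for container",
        "'260e0c23-835a-48a8-ab40-c9566077373f' has exited",
    ]
    print(container_lifetime(parse(sample), "260e0c23-835a-48a8-ab40-c9566077373f"))
```

On the four sample lines, taken from the log above, the first container lived for roughly 2 minutes 12 seconds before its executor exited.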
On both slaves I see /usr/libexec/mesos mesos-executor running; however, on the
second slave no task is associated with mesos-executor. I would like to
know the cause of:
I0807 17:58:24.217295 575 containerizer.cpp:1123] Executor for container
'260e0c23-835a-48a8-ab40-c9566077373f' has exited
Any suggestions as to where I should be looking?
Cheers,
Nastooh Avessta
ENGINEER.SOFTWARE ENGINEERING
[email protected]
Phone: +1 604 647 1527
Cisco Systems Limited
595 Burrard Street, Suite 2123 Three Bentall Centre, PO Box 49121
VANCOUVER
BRITISH COLUMBIA
V7X 1J1
CA