Vinod Kone created MESOS-976:
--------------------------------
Summary: SlaveRecoveryTest/1.SchedulerFailover is flaky
Key: MESOS-976
URL: https://issues.apache.org/jira/browse/MESOS-976
Project: Mesos
Issue Type: Bug
Components: test
Affects Versions: 0.18.0
Reporter: Vinod Kone
Fix For: 0.18.0
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from SlaveRecoveryTest/1, where TypeParam =
mesos::internal::slave::CgroupsIsolator
[ RUN ] SlaveRecoveryTest/1.SchedulerFailover
I0206 20:18:31.525116 56447 master.cpp:239] Master ID:
2014-02-06-20:18:31-1740121354-55566-56447 Hostname:
smfd-bkq-03-sr4.devel.twitter.com
I0206 20:18:31.525295 56481 master.cpp:321] Master started on
10.37.184.103:55566
I0206 20:18:31.525315 56481 master.cpp:324] Master only allowing authenticated
frameworks to register!
I0206 20:18:31.527093 56481 master.cpp:756] The newly elected leader is
master@10.37.184.103:55566
I0206 20:18:31.527122 56481 master.cpp:764] Elected as the leading master!
I0206 20:18:31.530642 56473 slave.cpp:112] Slave started on
9)@10.37.184.103:55566
I0206 20:18:31.530802 56473 slave.cpp:212] Slave resources: cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0206 20:18:31.531203 56473 slave.cpp:240] Slave hostname:
smfd-bkq-03-sr4.devel.twitter.com
I0206 20:18:31.531221 56473 slave.cpp:241] Slave checkpoint: true
I0206 20:18:31.531991 56482 cgroups_isolator.cpp:225] Using
/tmp/mesos_test_cgroup as cgroups hierarchy root
I0206 20:18:31.532470 56478 state.cpp:33] Recovering state from
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta'
I0206 20:18:31.532698 56469 status_update_manager.cpp:188] Recovering status
update manager
I0206 20:18:31.533962 56472 sched.cpp:265] Authenticating with master
master@10.37.184.103:55566
I0206 20:18:31.534102 56472 sched.cpp:234] Detecting new master
I0206 20:18:31.534124 56484 authenticatee.hpp:124] Creating new client SASL
connection
I0206 20:18:31.534299 56473 master.cpp:2317] Authenticating framework at
scheduler(9)@10.37.184.103:55566
I0206 20:18:31.534459 56461 authenticator.hpp:140] Creating new server SASL
connection
I0206 20:18:31.534572 56466 authenticatee.hpp:212] Received SASL authentication
mechanisms: CRAM-MD5
I0206 20:18:31.534595 56466 authenticatee.hpp:238] Attempting to authenticate
with mechanism 'CRAM-MD5'
I0206 20:18:31.534667 56474 authenticator.hpp:243] Received SASL authentication
start
I0206 20:18:31.534732 56474 authenticator.hpp:325] Authentication requires more
steps
I0206 20:18:31.534814 56468 authenticatee.hpp:258] Received SASL authentication
step
I0206 20:18:31.534946 56466 authenticator.hpp:271] Received SASL authentication
step
I0206 20:18:31.535007 56466 authenticator.hpp:317] Authentication success
I0206 20:18:31.535084 56471 authenticatee.hpp:298] Authentication success
I0206 20:18:31.535107 56461 master.cpp:2357] Successfully authenticated
framework at scheduler(9)@10.37.184.103:55566
I0206 20:18:31.535392 56476 sched.cpp:339] Successfully authenticated with
master master@10.37.184.103:55566
I0206 20:18:31.535512 56465 master.cpp:812] Received registration request from
scheduler(9)@10.37.184.103:55566
I0206 20:18:31.535570 56465 master.cpp:830] Registering framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 at
scheduler(9)@10.37.184.103:55566
I0206 20:18:31.535856 56465 hierarchical_allocator_process.hpp:332] Added
framework 2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.537802 56482 cgroups_isolator.cpp:840] Recovering isolator
I0206 20:18:31.538462 56472 slave.cpp:2760] Finished recovery
I0206 20:18:31.538910 56472 slave.cpp:508] New master detected at
master@10.37.184.103:55566
I0206 20:18:31.539036 56478 status_update_manager.cpp:162] New master detected
at master@10.37.184.103:55566
I0206 20:18:31.539223 56464 master.cpp:1834] Attempting to register slave on
smfd-bkq-03-sr4.devel.twitter.com at slave(9)@10.37.184.103:55566
I0206 20:18:31.539271 56472 slave.cpp:533] Detecting new master
I0206 20:18:31.539330 56464 master.cpp:2804] Adding slave
2014-02-06-20:18:31-1740121354-55566-56447-0 at
smfd-bkq-03-sr4.devel.twitter.com with cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000]
I0206 20:18:31.539454 56472 slave.cpp:551] Registered with master
master@10.37.184.103:55566; given slave ID
2014-02-06-20:18:31-1740121354-55566-56447-0
I0206 20:18:31.539620 56472 slave.cpp:564] Checkpointing SlaveInfo to
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/slave.info'
I0206 20:18:31.539834 56475 hierarchical_allocator_process.hpp:445] Added slave
2014-02-06-20:18:31-1740121354-55566-56447-0
(smfd-bkq-03-sr4.devel.twitter.com) with cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000] (and cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000] available)
I0206 20:18:31.540341 56472 master.cpp:2272] Sending 1 offers to framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.543433 56472 master.cpp:1568] Processing reply for offers: [
2014-02-06-20:18:31-1740121354-55566-56447-0 ] on slave
2014-02-06-20:18:31-1740121354-55566-56447-0
(smfd-bkq-03-sr4.devel.twitter.com) for framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.543642 56472 master.hpp:411] Adding task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 with resources cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000] on slave
2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com)
I0206 20:18:31.543781 56472 master.cpp:2441] Launching task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com)
I0206 20:18:31.544002 56484 slave.cpp:736] Got assigned task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 for framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.544097 56484 slave.cpp:2899] Checkpointing FrameworkInfo to
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/framework.info'
I0206 20:18:31.544272 56484 slave.cpp:2906] Checkpointing framework pid
'scheduler(9)@10.37.184.103:55566' to
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/framework.pid'
I0206 20:18:31.544617 56484 slave.cpp:845] Launching task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 for framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.546721 56484 slave.cpp:3169] Checkpointing ExecutorInfo to
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/executor.info'
I0206 20:18:31.547317 56484 slave.cpp:3257] Checkpointing TaskInfo to
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986/tasks/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/task.info'
I0206 20:18:31.547514 56484 slave.cpp:955] Queuing task
'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' for executor
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
'2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.547590 56481 cgroups_isolator.cpp:517] Launching
d045a0bd-2ed2-410a-bd1f-5bd9219896e3
(/home/vinod/mesos/build/src/mesos-executor) in
/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986
with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
for framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 in cgroup
mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986
I0206 20:18:31.548408 56481 cgroups_isolator.cpp:717] Changing cgroup controls
for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0206 20:18:31.548833 56481 cgroups_isolator.cpp:1007] Updated 'cpu.shares' to
2048 for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.549294 56481 cgroups_isolator.cpp:1117] Updated
'memory.soft_limit_in_bytes' to 1GB for executor
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.550107 56481 cgroups_isolator.cpp:1147] Updated
'memory.limit_in_bytes' to 1GB for executor
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.550571 56481 cgroups_isolator.cpp:1174] Started listening for
OOM events for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.551553 56481 cgroups_isolator.cpp:569] Forked executor at = 56671
Checkpointing executor's forked pid 56671 to
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986/pids/forked.pid'
I0206 20:18:31.552222 56472 slave.cpp:2098] Monitoring executor
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 forked at pid 56671
Fetching resources into
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986'
I0206 20:18:31.604012 56472 slave.cpp:1431] Got registration for executor
'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.604167 56472 slave.cpp:1516] Checkpointing executor pid
'executor(1)@10.37.184.103:46181' to
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/executors/d045a0bd-2ed2-410a-bd1f-5bd9219896e3/runs/9adabe16-5d84-45c9-bc83-1a72a6d1c986/pids/libprocess.pid'
I0206 20:18:31.605183 56472 slave.cpp:1552] Flushing queued task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 for executor
'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
Registered executor on smfd-bkq-03-sr4.devel.twitter.com
Starting task d045a0bd-2ed2-410a-bd1f-5bd9219896e3
sh -c 'sleep 1000'
Forked command at 56712
I0206 20:18:31.613098 56481 slave.cpp:1765] Handling status update TASK_RUNNING
(UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 from
executor(1)@10.37.184.103:46181
I0206 20:18:31.613628 56469 status_update_manager.cpp:314] Received status
update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.614006 56469 status_update_manager.hpp:342] Checkpointing UPDATE
for status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for
task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.795529 56469 status_update_manager.cpp:367] Forwarding status
update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 to master@10.37.184.103:55566
I0206 20:18:31.795992 56480 slave.cpp:1890] Sending acknowledgement for status
update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 to
executor(1)@10.37.184.103:46181
I0206 20:18:31.796131 56471 master.cpp:2020] Status update TASK_RUNNING (UUID:
fc151a46-751b-4c4b-b048-1727752f34e3) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 from
slave(9)@10.37.184.103:55566
I0206 20:18:31.797099 56483 status_update_manager.cpp:392] Received status
update acknowledgement (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.797165 56483 status_update_manager.hpp:342] Checkpointing ACK
for status update TASK_RUNNING (UUID: fc151a46-751b-4c4b-b048-1727752f34e3) for
task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.882767 56481 slave.cpp:394] Slave terminating
I0206 20:18:31.883112 56481 master.cpp:641] Slave
2014-02-06-20:18:31-1740121354-55566-56447-0
(smfd-bkq-03-sr4.devel.twitter.com) disconnected
I0206 20:18:31.883200 56476 hierarchical_allocator_process.hpp:484] Slave
2014-02-06-20:18:31-1740121354-55566-56447-0 disconnected
I0206 20:18:31.888206 56473 sched.cpp:265] Authenticating with master
master@10.37.184.103:55566
I0206 20:18:31.888473 56473 sched.cpp:234] Detecting new master
I0206 20:18:31.888556 56469 authenticatee.hpp:124] Creating new client SASL
connection
I0206 20:18:31.888978 56484 master.cpp:2317] Authenticating framework at
scheduler(10)@10.37.184.103:55566
I0206 20:18:31.889348 56469 authenticator.hpp:140] Creating new server SASL
connection
I0206 20:18:31.889925 56469 authenticatee.hpp:212] Received SASL authentication
mechanisms: CRAM-MD5
I0206 20:18:31.889989 56469 authenticatee.hpp:238] Attempting to authenticate
with mechanism 'CRAM-MD5'
I0206 20:18:31.890059 56469 authenticator.hpp:243] Received SASL authentication
start
I0206 20:18:31.890233 56469 authenticator.hpp:325] Authentication requires more
steps
I0206 20:18:31.890399 56468 authenticatee.hpp:258] Received SASL authentication
step
I0206 20:18:31.890554 56484 authenticator.hpp:271] Received SASL authentication
step
I0206 20:18:31.890630 56484 authenticator.hpp:317] Authentication success
I0206 20:18:31.890728 56470 authenticatee.hpp:298] Authentication success
I0206 20:18:31.890748 56484 master.cpp:2357] Successfully authenticated
framework at scheduler(10)@10.37.184.103:55566
I0206 20:18:31.892210 56469 sched.cpp:339] Successfully authenticated with
master master@10.37.184.103:55566
I0206 20:18:31.892410 56473 master.cpp:900] Re-registering framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 at
scheduler(10)@10.37.184.103:55566
I0206 20:18:31.892460 56473 master.cpp:926] Framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 failed over
W0206 20:18:31.892691 56465 master.cpp:1048] Ignoring deactivate framework
message for framework 2014-02-06-20:18:31-1740121354-55566-56447-0000 from
'scheduler(9)@10.37.184.103:55566' because it is not from the registered
framework 'scheduler(10)@10.37.184.103:55566'
I0206 20:18:31.897049 56466 slave.cpp:112] Slave started on
10)@10.37.184.103:55566
I0206 20:18:31.897207 56466 slave.cpp:212] Slave resources: cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0206 20:18:31.897536 56466 slave.cpp:240] Slave hostname:
smfd-bkq-03-sr4.devel.twitter.com
I0206 20:18:31.897554 56466 slave.cpp:241] Slave checkpoint: true
I0206 20:18:31.898388 56463 cgroups_isolator.cpp:225] Using
/tmp/mesos_test_cgroup as cgroups hierarchy root
I0206 20:18:31.898936 56472 state.cpp:33] Recovering state from
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta'
I0206 20:18:31.901702 56465 slave.cpp:2828] Recovering framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.901759 56465 slave.cpp:3020] Recovering executor
'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:31.902716 56464 status_update_manager.cpp:188] Recovering status
update manager
I0206 20:18:31.902884 56464 status_update_manager.cpp:196] Recovering executor
'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.475915 56463 cgroups_isolator.cpp:840] Recovering isolator
I0206 20:18:34.476066 56463 cgroups_isolator.cpp:847] Recovering executor
'd045a0bd-2ed2-410a-bd1f-5bd9219896e3' of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.477478 56463 cgroups_isolator.cpp:1174] Started listening for
OOM events for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.478728 56463 slave.cpp:2700] Sending reconnect request to
executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 at
executor(1)@10.37.184.103:46181
I0206 20:18:34.480114 56476 slave.cpp:1597] Re-registering executor
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.480566 56476 cgroups_isolator.cpp:717] Changing cgroup controls
for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0206 20:18:34.481370 56476 cgroups_isolator.cpp:1007] Updated 'cpu.shares' to
2048 for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.481827 56476 cgroups_isolator.cpp:1117] Updated
'memory.soft_limit_in_bytes' to 1GB for executor
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
Re-registered executor on smfd-bkq-03-sr4.devel.twitter.com
I0206 20:18:34.489497 56471 slave.cpp:1713] Cleaning up un-reregistered
executors
I0206 20:18:34.489588 56471 slave.cpp:2760] Finished recovery
I0206 20:18:34.490048 56463 slave.cpp:508] New master detected at
master@10.37.184.103:55566
I0206 20:18:34.490257 56475 status_update_manager.cpp:162] New master detected
at master@10.37.184.103:55566
I0206 20:18:34.490357 56463 slave.cpp:533] Detecting new master
W0206 20:18:34.490603 56480 master.cpp:1878] Slave at
slave(10)@10.37.184.103:55566 (smfd-bkq-03-sr4.devel.twitter.com) is being
allowed to re-register with an already in use id
(2014-02-06-20:18:31-1740121354-55566-56447-0)
I0206 20:18:34.490927 56479 slave.cpp:601] Re-registered with master
master@10.37.184.103:55566
I0206 20:18:34.491322 56461 hierarchical_allocator_process.hpp:498] Slave
2014-02-06-20:18:31-1740121354-55566-56447-0 reconnected
I0206 20:18:34.491421 56468 slave.cpp:1312] Updating framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 pid to
scheduler(10)@10.37.184.103:55566
I0206 20:18:34.491444 56480 master.cpp:1673] Asked to kill task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.491488 56468 slave.cpp:1320] Checkpointing framework pid
'scheduler(10)@10.37.184.103:55566' to
'/tmp/SlaveRecoveryTest_1_SchedulerFailover_7dC2N1/meta/slaves/2014-02-06-20:18:31-1740121354-55566-56447-0/frameworks/2014-02-06-20:18:31-1740121354-55566-56447-0000/framework.pid'
I0206 20:18:34.491497 56480 master.cpp:1707] Telling slave
2014-02-06-20:18:31-1740121354-55566-56447-0
(smfd-bkq-03-sr4.devel.twitter.com) to kill task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.491657 56468 slave.cpp:1013] Asked to kill task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
Shutting down
Killing process tree at pid 56712
Killed the following process trees:
[
--- 56712 sleep 1000
]
Command terminated with signal Killed (pid: 56712)
I0206 20:18:34.615216 56463 slave.cpp:1765] Handling status update TASK_KILLED
(UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 from
executor(1)@10.37.184.103:46181
I0206 20:18:34.615556 56483 cgroups_isolator.cpp:717] Changing cgroup controls
for executor d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 with resources
I0206 20:18:34.615624 56476 status_update_manager.cpp:314] Received status
update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.615701 56476 status_update_manager.hpp:342] Checkpointing UPDATE
for status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for
task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.706945 56476 status_update_manager.cpp:367] Forwarding status
update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 to master@10.37.184.103:55566
I0206 20:18:34.707263 56476 slave.cpp:1890] Sending acknowledgement for status
update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 to
executor(1)@10.37.184.103:46181
I0206 20:18:34.707352 56469 master.cpp:2020] Status update TASK_KILLED (UUID:
d9d37827-3002-4a67-8659-fa36f1986fc7) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000 from
slave(10)@10.37.184.103:55566
I0206 20:18:34.707620 56469 master.hpp:429] Removing task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 with resources cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000] on slave
2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com)
I0206 20:18:34.708348 56466 hierarchical_allocator_process.hpp:637] Recovered
cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total
allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on
slave 2014-02-06-20:18:31-1740121354-55566-56447-0 from framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.708673 56469 status_update_manager.cpp:392] Received status
update acknowledgement (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for task
d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.708749 56469 status_update_manager.hpp:342] Checkpointing ACK
for status update TASK_KILLED (UUID: d9d37827-3002-4a67-8659-fa36f1986fc7) for
task d045a0bd-2ed2-410a-bd1f-5bd9219896e3 of framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.709411 56470 master.cpp:2272] Sending 1 offers to framework
2014-02-06-20:18:31-1740121354-55566-56447-0000
I0206 20:18:34.809782 56447 master.cpp:583] Master terminating
I0206 20:18:34.810066 56447 master.cpp:246] Shutting down master
I0206 20:18:34.810134 56482 slave.cpp:1965] master@10.37.184.103:55566 exited
W0206 20:18:34.810184 56482 slave.cpp:1968] Master disconnected! Waiting for a
new master to be elected
I0206 20:18:34.810652 56447 master.cpp:289] Removing slave
2014-02-06-20:18:31-1740121354-55566-56447-0 (smfd-bkq-03-sr4.devel.twitter.com)
I0206 20:18:34.813144 56447 slave.cpp:394] Slave terminating
I0206 20:18:34.821583 56467 cgroups.cpp:1209] Trying to freeze cgroup
/tmp/mesos_test_cgroup/mesos_test
I0206 20:18:34.821652 56467 cgroups.cpp:1248] Successfully froze cgroup
/tmp/mesos_test_cgroup/mesos_test after 1 attempts
I0206 20:18:34.823129 56471 cgroups.cpp:1224] Trying to thaw cgroup
/tmp/mesos_test_cgroup/mesos_test
I0206 20:18:34.823247 56471 cgroups.cpp:1334] Successfully thawed
/tmp/mesos_test_cgroup/mesos_test
I0206 20:18:34.923945 56470 cgroups.cpp:1209] Trying to freeze cgroup
/tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986
I0206 20:18:34.924018 56470 cgroups.cpp:1248] Successfully froze cgroup
/tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986
after 1 attempts
I0206 20:18:34.925506 56461 cgroups.cpp:1224] Trying to thaw cgroup
/tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986
I0206 20:18:34.925580 56461 cgroups.cpp:1334] Successfully thawed
/tmp/mesos_test_cgroup/mesos_test/framework_2014-02-06-20:18:31-1740121354-55566-56447-0000_executor_d045a0bd-2ed2-410a-bd1f-5bd9219896e3_tag_9adabe16-5d84-45c9-bc83-1a72a6d1c986
[ OK ] SlaveRecoveryTest/1.SchedulerFailover (3408 ms)
[----------] 1 test from SlaveRecoveryTest/1 (3409 ms total)
[----------] Global test environment tear-down
../../src/tests/environment.cpp:247: Failure
Failed
Tests completed with child processes remaining:
-+- 56447 /home/vinod/mesos/build/src/.libs/lt-mesos-tests --verbose
--gtest_filter=*SlaveRecoveryTest/1.SchedulerFailover* --gtest_repeat=10
\--- 56671 ()