See <https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Set-JAVA_HOME/319/changes>

Changes:

[vinodkone] Added checkpoint field to slave info and fixed master to remove slaves that have disabled checkpointing.

Review: https://reviews.apache.org/r/9927

[benh] Explicitly disabled tests with the CgroupsIsolationModule type parameter.

Review: https://reviews.apache.org/r/9920

[benh] Added a TERMINATING process state to avoid filtering or enqueing events after process cleanup has begun.

https://reviews.apache.org/r/9922

[vinodkone] Fixed the master to send updated framework pids and relink to the reregistered slave.

Review: https://reviews.apache.org/r/9918

------------------------------------------
[...truncated 6454 lines...]
I0315 01:16:54.200036 24473 hierarchical_allocator_process.hpp:666] No resources available to allocate!
I0315 01:16:54.216889 24473 hierarchical_allocator_process.hpp:597] Performed allocation for 1 slaves in 16.88ms
I0315 01:16:54.350396 24470 slave.cpp:390] Slave terminating
I0315 01:16:54.350605 24473 master.cpp:521] Slave 201303150116-1015726915-49340-24450-0(janus.apache.org) disconnected
I0315 01:16:54.351173 24779 process.cpp:878] Socket closed while receiving
I0315 01:16:54.351202 24771 exec.cpp:321] Executor asked to shutdown
I0315 01:16:54.351323 24774 exec.cpp:75] Scheduling shutdown of the executor
Waited on process 24780, returned status 15
I0315 01:16:54.351536 24773 exec.cpp:382] Executor sending status update for task 2cda50a5-44f1-4282-aee1-c9af21457b60 in state TASK_FAILED
I0315 01:16:54.351557 24469 slave.cpp:202] Slave started on 20)@67.195.138.60:49340
I0315 01:16:54.351595 24469 slave.cpp:203] Slave resources: cpus=2; mem=1024; ports=[31000-32000]; disk=1024
I0315 01:16:54.352246 24469 state.cpp:33] Recovering state from /tmp/SlaveRecoveryTest_0_RecoverTerminatedExecutor_neaSYz/meta
I0315 01:16:54.351651 24470 master.cpp:576] Still acting as master!
I0315 01:16:54.353718 24472 status_update_manager.cpp:153] Recovering status update manager I0315 01:16:54.353865 24472 status_update_manager.cpp:157] Recovering executor '2cda50a5-44f1-4282-aee1-c9af21457b60' of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:54.354353 24472 status_update_manager.cpp:402] Creating StatusUpdate stream for task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:54.354193 24469 slave.cpp:457] New master detected at [email protected]:49340 I0315 01:16:54.354882 24472 status_update_manager.hpp:233] Replaying status update stream for task 2cda50a5-44f1-4282-aee1-c9af21457b60 I0315 01:16:54.355752 24472 status_update_manager.hpp:314] Handling UPDATE for status update TASK_RUNNING from task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:54.356232 24472 status_update_manager.hpp:314] Handling ACK for status update TASK_RUNNING from task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:54.356930 24473 process_based_isolation_module.cpp:300] Recovering isolation module I0315 01:16:54.357249 24473 process_based_isolation_module.cpp:308] Recovering executor '2cda50a5-44f1-4282-aee1-c9af21457b60' of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:54.356950 24472 status_update_manager.cpp:131] New master detected at [email protected]:49340 I0315 01:16:54.358147 24471 slave.cpp:1758] Recovering executors I0315 01:16:54.358636 24471 slave.cpp:1762] Recovering executor '2cda50a5-44f1-4282-aee1-c9af21457b60' of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:54.359338 24471 slave.cpp:1844] Sending reconnect request to executor 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 at executor(1)@67.195.138.60:47148 I0315 01:16:54.360571 24779 process.cpp:878] Socket closed while receiving I0315 01:16:54.360635 
24777 exec.cpp:216] Ignoring reconnect message from slave 201303150116-1015726915-49340-24450-0 because the driver is aborted! I0315 01:16:54.360599 24471 slave.cpp:440] Successfully attached file '/tmp/SlaveRecoveryTest_0_RecoverTerminatedExecutor_neaSYz/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/2cda50a5-44f1-4282-aee1-c9af21457b60/runs/674f6360-d7ee-4756-8e21-ad1928cf8302' I0315 01:16:55.217609 24472 hierarchical_allocator_process.hpp:666] No resources available to allocate! I0315 01:16:55.225368 24472 hierarchical_allocator_process.hpp:597] Performed allocation for 1 slaves in 7.79ms I0315 01:16:55.353036 24477 process.cpp:878] Socket closed while receiving I0315 01:16:56.227113 24474 hierarchical_allocator_process.hpp:666] No resources available to allocate! I0315 01:16:56.233194 24474 hierarchical_allocator_process.hpp:597] Performed allocation for 1 slaves in 6.12ms I0315 01:16:56.352308 24474 process_based_isolation_module.cpp:416] Telling slave of lost executor 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.352409 24470 slave.cpp:1434] Executor '2cda50a5-44f1-4282-aee1-c9af21457b60' of framework 201303150116-1015726915-49340-24450-0000 has exited with status 0 I0315 01:16:56.355842 24470 slave.cpp:1188] Handling status update TASK_FAILED from task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.355995 24470 slave.cpp:1235] Forwarding status update TASK_FAILED from task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 to the status update manager I0315 01:16:56.356487 24469 status_update_manager.cpp:253] Received status update TASK_FAILED from task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.356577 24471 gc.cpp:97] Scheduling 
/tmp/SlaveRecoveryTest_0_RecoverTerminatedExecutor_neaSYz/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/2cda50a5-44f1-4282-aee1-c9af21457b60/runs/674f6360-d7ee-4756-8e21-ad1928cf8302 for removal I0315 01:16:56.356724 24469 status_update_manager.hpp:283] Checkpointing UPDATE for status update TASK_FAILED from task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.361495 24475 slave.cpp:1157] Cleaning up un-reregistered executors I0315 01:16:56.361985 24475 slave.cpp:381] Finished recovery W0315 01:16:56.362040 24473 master.cpp:983] Slave at slave(20)@67.195.138.60:49340 (janus.apache.org) is being allowed to re-register with an already in use id (201303150116-1015726915-49340-24450-0) I0315 01:16:56.374301 24472 slave.cpp:536] Re-registered with master I0315 01:16:56.352390 24474 process_utils.hpp:64] Stopping ... 24752 Sent signal to 24752 I0315 01:16:56.434831 24469 status_update_manager.hpp:314] Handling UPDATE for status update TASK_FAILED from task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.434895 24469 status_update_manager.cpp:288] Forwarding status update TASK_FAILED from task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 to the master at [email protected]:49340 I0315 01:16:56.435355 24473 master.cpp:1054] Status update from (125)@67.195.138.60:49340: task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 is now in state TASK_FAILED I0315 01:16:56.435739 24470 sched.cpp:327] Status update: task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 is now in state TASK_FAILED I0315 01:16:56.436223 24470 slave.cpp:964] Got acknowledgement of status update for task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 
01:16:56.436697 24470 status_update_manager.cpp:313] Received status update acknowledgement for task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.437082 24470 status_update_manager.hpp:283] Checkpointing ACK for status update TASK_FAILED from task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.435747 24473 master.hpp:296] Removing task with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303150116-1015726915-49340-24450-0 I0315 01:16:56.436303 24450 sched.cpp:422] Stopping framework '201303150116-1015726915-49340-24450-0000' I0315 01:16:56.438206 24475 hierarchical_allocator_process.hpp:542] Recovered cpus=2; mem=1024; ports=[31000-32000]; disk=1024 (total allocatable: cpus=2; mem=1024; ports=[31000-32000]; disk=1024) on slave 201303150116-1015726915-49340-24450-0 from framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.446329 24450 slave.cpp:428] Slave asked to shut down by @0.0.0.0:0 I0315 01:16:56.446357 24473 master.cpp:742] Asked to unregister framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.477685 24450 slave.cpp:390] Slave terminating I0315 01:16:56.498335 24473 hierarchical_allocator_process.hpp:357] Deactivated framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.506286 24472 master.cpp:521] Slave 201303150116-1015726915-49340-24450-0(janus.apache.org) disconnected I0315 01:16:56.506320 24473 hierarchical_allocator_process.hpp:310] Removed framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.559798 24470 status_update_manager.hpp:314] Handling ACK for status update TASK_FAILED from task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.560212 24470 status_update_manager.cpp:433] Cleaning up status update stream for task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 
201303150116-1015726915-49340-24450-0000 I0315 01:16:56.560627 24470 status_update_manager.hpp:257] Deleting the meta directory for task 2cda50a5-44f1-4282-aee1-c9af21457b60 of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.561700 24450 master.cpp:477] Master terminating I0315 01:16:56.561743 24450 master.cpp:283] Shutting down master I0315 01:16:56.562260 24469 hierarchical_allocator_process.hpp:421] Removed slave 201303150116-1015726915-49340-24450-0 [ OK ] SlaveRecoveryTest/0.RecoverTerminatedExecutor (3369 ms) [ RUN ] SlaveRecoveryTest/0.CleanupExecutor I0315 01:16:56.564337 24471 master.cpp:309] Master started on 67.195.138.60:49340 I0315 01:16:56.564385 24471 master.cpp:324] Master ID: 201303150116-1015726915-49340-24450 I0315 01:16:56.564411 24474 slave.cpp:202] Slave started on 21)@67.195.138.60:49340 I0315 01:16:56.565208 24474 slave.cpp:203] Slave resources: cpus=2; mem=1024; ports=[31000-32000]; disk=1024 I0315 01:16:56.564836 24476 hierarchical_allocator_process.hpp:234] Initializing hierarchical allocator process with master : [email protected]:49340 W0315 01:16:56.564865 24475 master.cpp:79] No whitelist given. Advertising offers for all slaves I0315 01:16:56.565002 24471 master.cpp:571] Elected as master! 
I0315 01:16:56.564456 24472 sched.cpp:182] New master at [email protected]:49340 I0315 01:16:56.566009 24469 process_based_isolation_module.cpp:300] Recovering isolation module I0315 01:16:56.566032 24474 slave.cpp:457] New master detected at [email protected]:49340 I0315 01:16:56.568629 24474 slave.cpp:381] Finished recovery I0315 01:16:56.567739 24472 master.cpp:614] Registering framework 201303150116-1015726915-49340-24450-0000 at scheduler(17)@67.195.138.60:49340 I0315 01:16:56.568645 24469 status_update_manager.cpp:131] New master detected at [email protected]:49340 I0315 01:16:56.570289 24475 sched.cpp:217] Framework registered with 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.570315 24472 master.cpp:936] Attempting to register slave on janus.apache.org at slave(21)@67.195.138.60:49340 I0315 01:16:56.571666 24472 master.cpp:1191] Master now considering a slave at janus.apache.org:49340 as active I0315 01:16:56.572165 24472 master.cpp:1767] Adding slave 201303150116-1015726915-49340-24450-0 at janus.apache.org with cpus=2; mem=1024; ports=[31000-32000]; disk=1024 I0315 01:16:56.570328 24473 hierarchical_allocator_process.hpp:266] Added framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.573120 24473 hierarchical_allocator_process.hpp:666] No resources available to allocate! 
I0315 01:16:56.573616 24473 hierarchical_allocator_process.hpp:597] Performed allocation for 0 slaves in 496.67us I0315 01:16:56.574131 24473 hierarchical_allocator_process.hpp:393] Added slave 201303150116-1015726915-49340-24450-0 (janus.apache.org) with cpus=2; mem=1024; ports=[31000-32000]; disk=1024 (and cpus=2; mem=1024; ports=[31000-32000]; disk=1024 available) I0315 01:16:56.574609 24473 hierarchical_allocator_process.hpp:658] Found available resources: cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303150116-1015726915-49340-24450-0 I0315 01:16:56.572718 24471 slave.cpp:491] Registered with master; given slave ID 201303150116-1015726915-49340-24450-0 I0315 01:16:56.575618 24471 paths.hpp:335] Created slave directory '/tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta/slaves/201303150116-1015726915-49340-24450-0' Checkpointing SlaveInfo to '/tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta/slaves/201303150116-1015726915-49340-24450-0/slave.info' I0315 01:16:56.575112 24473 hierarchical_allocator_process.hpp:684] Offering cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303150116-1015726915-49340-24450-0 to framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.576800 24473 hierarchical_allocator_process.hpp:617] Performed allocation for slave 201303150116-1015726915-49340-24450-0 in 2.20ms I0315 01:16:56.576830 24474 master.hpp:305] Adding offer with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303150116-1015726915-49340-24450-0 I0315 01:16:56.578352 24474 master.cpp:1294] Sending 1 offers to framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.578840 24471 sched.cpp:282] Received 1 offers I0315 01:16:56.580510 24469 master.cpp:1501] Processing reply for offer 201303150116-1015726915-49340-24450-0 on slave 201303150116-1015726915-49340-24450-0 (janus.apache.org) for framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.580698 24469 master.hpp:285] Adding task with 
resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303150116-1015726915-49340-24450-0 I0315 01:16:56.580951 24469 master.cpp:1618] Launching task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303150116-1015726915-49340-24450-0 (janus.apache.org) I0315 01:16:56.581488 24476 slave.cpp:603] Got assigned task a3565c16-11ab-4845-90aa-594265a25e9b for framework 201303150116-1015726915-49340-24450-0000 Checkpointing FrameworkInfo to '/tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/framework.info' Checkpointing 'scheduler(17)@67.195.138.60:49340' to /tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/framework.pid I0315 01:16:56.583545 24476 paths.hpp:302] Created executor directory '/tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/runs/7bbf426e-352b-4458-a622-3641954f9791' Checkpointing ExecutorInfo to '/tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/executor.info' I0315 01:16:56.581521 24469 master.hpp:314] Removing offer with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303150116-1015726915-49340-24450-0 I0315 01:16:56.583973 24476 paths.hpp:302] Created executor directory '/tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/runs/7bbf426e-352b-4458-a622-3641954f9791' Checkpointing 
Task to '/tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/runs/7bbf426e-352b-4458-a622-3641954f9791/tasks/a3565c16-11ab-4845-90aa-594265a25e9b/task.info' I0315 01:16:56.584699 24476 slave.cpp:440] Successfully attached file '/tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/runs/7bbf426e-352b-4458-a622-3641954f9791' I0315 01:16:56.584704 24471 process_based_isolation_module.cpp:123] Launching a3565c16-11ab-4845-90aa-594265a25e9b (<https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Set-JAVA_HOME/ws/build/src/mesos-executor)> in /tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/runs/7bbf426e-352b-4458-a622-3641954f9791 with resources ' for framework 201303150116-1015726915-49340-24450-0000 Checkpointing forked pid 24822 Checkpointing '24822' to /tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/runs/7bbf426e-352b-4458-a622-3641954f9791/pids/forked.pid I0315 01:16:56.587703 24471 process_based_isolation_module.cpp:162] Forked executor at 24822 Fetching resources into /tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/runs/7bbf426e-352b-4458-a622-3641954f9791 WARNING: Logging before InitGoogleLogging() is written to STDERR I0315 01:16:56.630276 24827 process.cpp:1419] libprocess is initialized on 67.195.138.60:59502 for 8 cpus I0315 01:16:56.631948 
24841 exec.cpp:170] Executor started at: executor(1)@67.195.138.60:59502 with pid 24827 I0315 01:16:56.632731 24474 slave.cpp:1005] Got registration for executor 'a3565c16-11ab-4845-90aa-594265a25e9b' of framework 201303150116-1015726915-49340-24450-0000 Checkpointing 'executor(1)@67.195.138.60:59502' to /tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/runs/7bbf426e-352b-4458-a622-3641954f9791/pids/libprocess.pid I0315 01:16:56.633297 24474 slave.cpp:1080] Flushing queued tasks for framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.633779 24849 process.cpp:878] Socket closed while receiving I0315 01:16:56.633853 24845 exec.cpp:194] Executor registered on slave 201303150116-1015726915-49340-24450-0 Registered executor on janus.apache.org I0315 01:16:56.633966 24849 process.cpp:878] Socket closed while receiving I0315 01:16:56.634078 24846 exec.cpp:258] Executor asked to run task 'a3565c16-11ab-4845-90aa-594265a25e9b' Starting task a3565c16-11ab-4845-90aa-594265a25e9b sh -c 'sleep 1000' I0315 01:16:56.634605 24846 exec.cpp:382] Executor sending status update for task a3565c16-11ab-4845-90aa-594265a25e9b in state TASK_RUNNING I0315 01:16:56.635917 24473 slave.cpp:1188] Handling status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.635968 24473 slave.cpp:1235] Forwarding status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 to the status update manager I0315 01:16:56.636538 24473 status_update_manager.cpp:253] Received status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.636780 24473 status_update_manager.cpp:402] Creating StatusUpdate stream for task 
a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.637352 24473 status_update_manager.hpp:283] Checkpointing UPDATE for status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.735941 24473 status_update_manager.hpp:314] Handling UPDATE for status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.735988 24473 status_update_manager.cpp:288] Forwarding status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 to the master at [email protected]:49340 I0315 01:16:56.736498 24475 master.cpp:1054] Status update from (128)@67.195.138.60:49340: task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 is now in state TASK_RUNNING I0315 01:16:56.736624 24470 slave.cpp:1297] Sending ACK for status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 to executor executor(1)@67.195.138.60:59502 I0315 01:16:56.737532 24849 process.cpp:878] Socket closed while receiving I0315 01:16:56.737570 24847 exec.cpp:289] Executor received ACK for status update of task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.736889 24472 sched.cpp:327] Status update: task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 is now in state TASK_RUNNING I0315 01:16:56.737825 24472 slave.cpp:964] Got acknowledgement of status update for task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.738265 24472 status_update_manager.cpp:313] Received status update acknowledgement for task a3565c16-11ab-4845-90aa-594265a25e9b of framework 
201303150116-1015726915-49340-24450-0000 I0315 01:16:56.738669 24472 status_update_manager.hpp:283] Checkpointing ACK for status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:56.802181 24472 status_update_manager.hpp:314] Handling ACK for status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:57.566555 24473 hierarchical_allocator_process.hpp:666] No resources available to allocate! I0315 01:16:57.566602 24473 hierarchical_allocator_process.hpp:597] Performed allocation for 1 slaves in 83.69us I0315 01:16:57.737936 24476 slave.cpp:390] Slave terminating I0315 01:16:57.738162 24472 master.cpp:521] Slave 201303150116-1015726915-49340-24450-0(janus.apache.org) disconnected I0315 01:16:57.739027 24473 slave.cpp:202] Slave started on 22)@67.195.138.60:49340 I0315 01:16:57.739063 24473 slave.cpp:203] Slave resources: cpus=2; mem=1024; ports=[31000-32000]; disk=1024 I0315 01:16:57.739123 24469 master.cpp:576] Still acting as master! 
I0315 01:16:57.739748 24473 state.cpp:33] Recovering state from /tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta I0315 01:16:57.741600 24476 status_update_manager.cpp:153] Recovering status update manager I0315 01:16:57.741646 24476 status_update_manager.cpp:157] Recovering executor 'a3565c16-11ab-4845-90aa-594265a25e9b' of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:57.742023 24473 slave.cpp:457] New master detected at [email protected]:49340 I0315 01:16:57.742072 24476 status_update_manager.cpp:402] Creating StatusUpdate stream for task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:57.743021 24476 status_update_manager.hpp:233] Replaying status update stream for task a3565c16-11ab-4845-90aa-594265a25e9b I0315 01:16:57.743916 24476 status_update_manager.hpp:314] Handling UPDATE for status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:57.742475 24473 slave.cpp:472] Skipping registration because slave is started in 'cleanup' mode I0315 01:16:57.744335 24476 status_update_manager.hpp:314] Handling ACK for status update TASK_RUNNING from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:57.745476 24470 process_based_isolation_module.cpp:300] Recovering isolation module I0315 01:16:57.745823 24470 process_based_isolation_module.cpp:308] Recovering executor 'a3565c16-11ab-4845-90aa-594265a25e9b' of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:57.745486 24476 status_update_manager.cpp:131] New master detected at [email protected]:49340 I0315 01:16:57.746608 24473 slave.cpp:1758] Recovering executors I0315 01:16:57.747170 24473 slave.cpp:1762] Recovering executor 'a3565c16-11ab-4845-90aa-594265a25e9b' of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:57.748567 24473 slave.cpp:1853] Sending shutdown to executor 
a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 at executor(1)@67.195.138.60:59502 I0315 01:16:57.748816 24473 slave.cpp:1609] Shutting down executor 'a3565c16-11ab-4845-90aa-594265a25e9b' of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:57.749480 24849 process.cpp:878] Socket closed while receiving I0315 01:16:57.749512 24848 exec.cpp:321] Executor asked to shutdown I0315 01:16:57.749603 24848 exec.cpp:75] Scheduling shutdown of the executor Waited on process 24850, returned status 15 I0315 01:16:57.749801 24843 exec.cpp:382] Executor sending status update for task a3565c16-11ab-4845-90aa-594265a25e9b in state TASK_FAILED I0315 01:16:57.750157 24473 slave.cpp:440] Successfully attached file '/tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/runs/7bbf426e-352b-4458-a622-3641954f9791' I0315 01:16:57.750205 24473 slave.cpp:381] Finished recovery I0315 01:16:58.567729 24472 hierarchical_allocator_process.hpp:666] No resources available to allocate! I0315 01:16:58.573717 24472 hierarchical_allocator_process.hpp:597] Performed allocation for 1 slaves in 6.01ms I0315 01:16:58.751245 24477 process.cpp:878] Socket closed while receiving I0315 01:16:59.575033 24474 hierarchical_allocator_process.hpp:666] No resources available to allocate! 
I0315 01:16:59.581568 24474 hierarchical_allocator_process.hpp:597] Performed allocation for 1 slaves in 6.57ms I0315 01:16:59.740150 24470 process_based_isolation_module.cpp:416] Telling slave of lost executor a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:59.742136 24469 slave.cpp:1434] Executor 'a3565c16-11ab-4845-90aa-594265a25e9b' of framework 201303150116-1015726915-49340-24450-0000 has exited with status 0 I0315 01:16:59.743664 24469 slave.cpp:1188] Handling status update TASK_FAILED from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:59.743789 24469 slave.cpp:1235] Forwarding status update TASK_FAILED from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 to the status update manager I0315 01:16:59.744189 24472 status_update_manager.cpp:253] Received status update TASK_FAILED from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:59.744523 24472 status_update_manager.hpp:283] Checkpointing UPDATE for status update TASK_FAILED from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:59.744297 24475 gc.cpp:97] Scheduling /tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/slaves/201303150116-1015726915-49340-24450-0/frameworks/201303150116-1015726915-49340-24450-0000/executors/a3565c16-11ab-4845-90aa-594265a25e9b/runs/7bbf426e-352b-4458-a622-3641954f9791 for removal I0315 01:16:59.744261 24469 slave.cpp:1558] Slave is shutting down because it is started with --recover==cleanup and all executors have terminated! I0315 01:16:59.767133 24469 slave.cpp:1565] Archiving and deleting the meta directory '/tmp/SlaveRecoveryTest_0_CleanupExecutor_POtFMh/meta' to allow incompatible upgrade! 
tar: Removing leading `/' from member names I0315 01:16:59.777994 24469 slave.cpp:428] Slave asked to shut down by @0.0.0.0:0 I0315 01:16:59.778100 24469 slave.cpp:390] Slave terminating I0315 01:16:59.740216 24470 process_utils.hpp:64] Stopping ... 24822 Sent signal to 24822 I0315 01:16:59.831182 24472 status_update_manager.hpp:314] Handling UPDATE for status update TASK_FAILED from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:59.831274 24472 status_update_manager.cpp:288] Forwarding status update TASK_FAILED from task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 to the master at [email protected]:49340 I0315 01:16:59.831959 24473 master.cpp:1054] Status update from (132)@67.195.138.60:49340: task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 is now in state TASK_FAILED I0315 01:16:59.832192 24476 sched.cpp:327] Status update: task a3565c16-11ab-4845-90aa-594265a25e9b of framework 201303150116-1015726915-49340-24450-0000 is now in state TASK_FAILED I0315 01:16:59.832191 24473 master.hpp:296] Removing task with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303150116-1015726915-49340-24450-0 I0315 01:16:59.832712 24471 sched.cpp:422] Stopping framework '201303150116-1015726915-49340-24450-0000' I0315 01:16:59.833199 24475 hierarchical_allocator_process.hpp:542] Recovered cpus=2; mem=1024; ports=[31000-32000]; disk=1024 (total allocatable: cpus=2; mem=1024; ports=[31000-32000]; disk=1024) on slave 201303150116-1015726915-49340-24450-0 from framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:59.833587 24470 master.cpp:742] Asked to unregister framework 201303150116-1015726915-49340-24450-0000 I0315 01:16:59.834499 24470 master.cpp:477] Master terminating I0315 01:16:59.834522 24469 hierarchical_allocator_process.hpp:357] Deactivated framework 
201303150116-1015726915-49340-24450-0000
I0315 01:16:59.834985 24450 master.cpp:283] Shutting down master
I0315 01:16:59.835469 24469 hierarchical_allocator_process.hpp:310] Removed framework 201303150116-1015726915-49340-24450-0000
[ OK ] SlaveRecoveryTest/0.CleanupExecutor (3273 ms)
[----------] 6 tests from SlaveRecoveryTest/0 (13964 ms total)

[----------] 6 tests from SlaveRecoveryTest/1, where TypeParam = mesos::internal::slave::CgroupsIsolationModule
[ RUN ] SlaveRecoveryTest/1.RecoverSlaveState
../../src/tests/utils.hpp:188: Failure
cgroups::mount(TEST_CGROUPS_HIERARCHY, subsystems): Failed to mount 'cpu,cpuacct,memory,freezer' at '/tmp/mesos_test_cgroup': Operation not permitted
-------------------------------------------------------------
We cannot run any cgroups tests that require a hierarchy with subsystems 'cpu,cpuacct,memory,freezer' because we failed to find an existing hierarchy or create a new one. You can either remove all existing hierarchies, or disable this test case (i.e., --gtest_filter=-SlaveRecoveryTest/1.*).
-------------------------------------------------------------
I0315 01:16:59.838203 24471 master.cpp:309] Master started on 67.195.138.60:49340
I0315 01:16:59.838318 24471 master.cpp:324] Master ID: 201303150116-1015726915-49340-24450
I0315 01:16:59.838587 24475 slave.cpp:202] Slave started on 23)@67.195.138.60:49340
I0315 01:16:59.838809 24470 hierarchical_allocator_process.hpp:234] Initializing hierarchical allocator process with master : [email protected]:49340
W0315 01:16:59.838817 24473 master.cpp:79] No whitelist given. Advertising offers for all slaves
I0315 01:16:59.838978 24471 master.cpp:571] Elected as master!
I0315 01:16:59.839095 24475 slave.cpp:203] Slave resources: cpus=8; mem=15025; ports=[31000-32000]; disk=14042
Using cgroups requires root permissions
I0315 01:16:59.842336 24475 slave.cpp:457] New master detected at [email protected]:49340
I0315 01:16:59.842417 24475 slave.cpp:428] Slave asked to shut down by @0.0.0.0:0
I0315 01:16:59.842871 24475 slave.cpp:390] Slave terminating
I0315 01:16:59.842422 24471 status_update_manager.cpp:131] New master detected at [email protected]:49340
FAIL: mesos-tests
==================
1 of 1 test failed
==================
make[3]: *** [check-TESTS] Error 1
make[3]: Leaving directory `<https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Set-JAVA_HOME/ws/build/src'>
make[2]: *** [check-am] Error 2
make[2]: Leaving directory `<https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Set-JAVA_HOME/ws/build/src'>
make[1]: *** [check] Error 2
make[1]: Leaving directory `<https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Set-JAVA_HOME/ws/build/src'>
make: *** [check-recursive] Error 1
Process leaked file descriptors. See http://wiki.jenkins-ci.org/display/JENKINS/Spawning+processes+from+build for more information
Build step 'Execute shell' marked build as failure
