[
https://issues.apache.org/jira/browse/MESOS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167418#comment-15167418
]
Alexander Rojas commented on MESOS-4047:
----------------------------------------
After fixing the issues raised in previous comments, I managed to reproduce
the failure shown in the logs posted here. Apparently there is yet another
race: the executor can exit before the test reaches the line
{{Future<ResourceStatistics> usage = containerizer2.get()->usage(containerId);}},
in which case the call fails with "Unknown container". I collected verbose logs
for a good and a bad run and include only the relevant sections below. Pay
attention to lines that look like {{I0224 13:53:53.169703 25060
slave.cpp:3528] executor(1)@127.0.0.1:38732 exited}}.
The good run:
{noformat}
...
I0224 13:53:52.219846 25063 slave.cpp:1891] Asked to kill task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
Received killTask
Shutting down
Sending SIGTERM to process tree at pid 31659
Sent SIGTERM to the following process trees:
[
-+- 31659 sh -c while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done
\--- 31661 dd count=512 bs=1M if=/dev/zero of=./temp
]
Command terminated with signal Terminated (pid: 31659)
I0224 13:53:52.369876 25062 slave.cpp:3002] Handling status update TASK_KILLED
(UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 from executor(1)@127.0.0.1:38732
I0224 13:53:52.386056 25059 mem.cpp:353] Updated 'memory.soft_limit_in_bytes'
to 32MB for container d78a1f77-a3a1-44e4-9898-a62523a1c1e0
I0224 13:53:53.113471 25059 mem.cpp:388] Updated 'memory.limit_in_bytes' to
32MB for container d78a1f77-a3a1-44e4-9898-a62523a1c1e0
I0224 13:53:53.117938 25059 status_update_manager.cpp:320] Received status
update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.118013 25059 status_update_manager.cpp:824] Checkpointing UPDATE
for status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for
task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.146458 25058 slave.cpp:3400] Forwarding the update TASK_KILLED
(UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 to [email protected]:57058
I0224 13:53:53.146702 25058 slave.cpp:3310] Sending acknowledgement for status
update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 to executor(1)@127.0.0.1:38732
I0224 13:53:53.147956 25062 master.cpp:4794] Status update TASK_KILLED (UUID:
4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 from slave
92632338-e777-41c7-a9a3-39dc62fdea4c-S0 at slave(278)@127.0.0.1:57058
(localhost)
I0224 13:53:53.147989 25062 master.cpp:4842] Forwarding status update
TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.148143 25062 master.cpp:6450] Updating the state of task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 (latest state: TASK_KILLED, status
update state: TASK_KILLED)
I0224 13:53:53.149320 25061 master.cpp:3952] Processing ACKNOWLEDGE call
4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8 for task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 (default) at
[email protected]:57058 on slave
92632338-e777-41c7-a9a3-39dc62fdea4c-S0
I0224 13:53:53.149684 25061 master.cpp:6516] Removing task
21236fe6-f5b3-4647-b4b0-fd83827436a3 with resources cpus(*):1; mem(*):256;
disk(*):1024 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c-0000 on slave
92632338-e777-41c7-a9a3-39dc62fdea4c-S0 at slave(278)@127.0.0.1:57058
(localhost)
I0224 13:53:53.150146 25061 status_update_manager.cpp:392] Received status
update acknowledgement (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.150410 25061 status_update_manager.cpp:824] Checkpointing ACK
for status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for
task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.153118 25056 sched.cpp:1903] Asked to stop the driver
I0224 13:53:53.153228 25064 sched.cpp:1143] Stopping framework
'92632338-e777-41c7-a9a3-39dc62fdea4c-0000'
I0224 13:53:53.154057 25061 master.cpp:5926] Processing TEARDOWN call for
framework 92632338-e777-41c7-a9a3-39dc62fdea4c-0000 (default) at
[email protected]:57058
I0224 13:53:53.154201 25061 master.cpp:5938] Removing framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 (default) at
[email protected]:57058
I0224 13:53:53.154716 25062 slave.cpp:2079] Asked to shut down framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 by [email protected]:57058
I0224 13:53:53.154887 25062 slave.cpp:2104] Shutting down framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.154953 25062 slave.cpp:4198] Shutting down executor
'21236fe6-f5b3-4647-b4b0-fd83827436a3' of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 at executor(1)@127.0.0.1:38732
I0224 13:53:53.154963 25061 master.cpp:1027] Master terminating
I0224 13:53:53.155953 31653 exec.cpp:390] Executor asked to shutdown
I0224 13:53:53.156373 25061 slave.cpp:3528] [email protected]:57058 exited
W0224 13:53:53.156425 25061 slave.cpp:3531] Master disconnected! Waiting for a
new master to be elected
I0224 13:53:53.157037 25057 hierarchical.cpp:375] Deactivated framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.157402 25057 hierarchical.cpp:326] Removed framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.160271 25062 containerizer.cpp:1378] Destroying container
'd78a1f77-a3a1-44e4-9898-a62523a1c1e0'
I0224 13:53:53.162210 25062 cgroups.cpp:2427] Freezing cgroup
/cgroup/freezer/mesos_test_82cf0b7f-b476-49b0-bfbb-42f4dd0110e9/d78a1f77-a3a1-44e4-9898-a62523a1c1e0
I0224 13:53:53.163861 25059 cgroups.cpp:1409] Successfully froze cgroup
/cgroup/freezer/mesos_test_82cf0b7f-b476-49b0-bfbb-42f4dd0110e9/d78a1f77-a3a1-44e4-9898-a62523a1c1e0
after 1.487104ms
I0224 13:53:53.165483 25060 cgroups.cpp:2445] Thawing cgroup
/cgroup/freezer/mesos_test_82cf0b7f-b476-49b0-bfbb-42f4dd0110e9/d78a1f77-a3a1-44e4-9898-a62523a1c1e0
I0224 13:53:53.167999 25059 cgroups.cpp:1438] Successfullly thawed cgroup
/cgroup/freezer/mesos_test_82cf0b7f-b476-49b0-bfbb-42f4dd0110e9/d78a1f77-a3a1-44e4-9898-a62523a1c1e0
after 2.372864ms
I0224 13:53:53.169703 25060 slave.cpp:3528] executor(1)@127.0.0.1:38732 exited
I0224 13:53:53.226868 25058 containerizer.cpp:1594] Executor for container
'd78a1f77-a3a1-44e4-9898-a62523a1c1e0' has exited
I0224 13:53:53.339517 25057 provisioner.cpp:306] Ignoring destroy request for
unknown container d78a1f77-a3a1-44e4-9898-a62523a1c1e0
I0224 13:53:53.340080 25063 slave.cpp:3886] Executor
'21236fe6-f5b3-4647-b4b0-fd83827436a3' of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 terminated with signal Killed
I0224 13:53:53.340206 25063 slave.cpp:3990] Cleaning up executor
'21236fe6-f5b3-4647-b4b0-fd83827436a3' of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000 at executor(1)@127.0.0.1:38732
I0224 13:53:53.340931 25059 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_JVQWxS/slaves/92632338-e777-41c7-a9a3-39dc62fdea4c-S0/frameworks/92632338-e777-41c7-a9a3-39dc62fdea4c-0000/executors/21236fe6-f5b3-4647-b4b0-fd83827436a3/runs/d78a1f77-a3a1-44e4-9898-a62523a1c1e0'
for gc 6.99999605990815days in the future
I0224 13:53:53.341127 25063 slave.cpp:4078] Cleaning up framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.341518 25059 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_JVQWxS/slaves/92632338-e777-41c7-a9a3-39dc62fdea4c-S0/frameworks/92632338-e777-41c7-a9a3-39dc62fdea4c-0000/executors/21236fe6-f5b3-4647-b4b0-fd83827436a3'
for gc 6.9999960536days in the future
I0224 13:53:53.341814 25059 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_JVQWxS/meta/slaves/92632338-e777-41c7-a9a3-39dc62fdea4c-S0/frameworks/92632338-e777-41c7-a9a3-39dc62fdea4c-0000/executors/21236fe6-f5b3-4647-b4b0-fd83827436a3/runs/d78a1f77-a3a1-44e4-9898-a62523a1c1e0'
for gc 6.99999605247704days in the future
I0224 13:53:53.342157 25059 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_JVQWxS/meta/slaves/92632338-e777-41c7-a9a3-39dc62fdea4c-S0/frameworks/92632338-e777-41c7-a9a3-39dc62fdea4c-0000/executors/21236fe6-f5b3-4647-b4b0-fd83827436a3'
for gc 6.99999605214222days in the future
I0224 13:53:53.342463 25060 status_update_manager.cpp:282] Closing status
update streams for framework 92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.342669 25059 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_JVQWxS/slaves/92632338-e777-41c7-a9a3-39dc62fdea4c-S0/frameworks/92632338-e777-41c7-a9a3-39dc62fdea4c-0000'
for gc 6.99999603612148days in the future
I0224 13:53:53.343171 25063 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_JVQWxS/meta/slaves/92632338-e777-41c7-a9a3-39dc62fdea4c-S0/frameworks/92632338-e777-41c7-a9a3-39dc62fdea4c-0000'
for gc 6.99999603449185days in the future
I0224 13:53:53.343410 25056 slave.cpp:668] Slave terminating
I0224 13:53:53.349254 25059 cgroups.cpp:2427] Freezing cgroup
/cgroup/freezer/mesos_test_82cf0b7f-b476-49b0-bfbb-42f4dd0110e9
I0224 13:53:53.350904 25059 cgroups.cpp:1409] Successfully froze cgroup
/cgroup/freezer/mesos_test_82cf0b7f-b476-49b0-bfbb-42f4dd0110e9 after 1.454848ms
I0224 13:53:53.352203 25059 cgroups.cpp:2445] Thawing cgroup
/cgroup/freezer/mesos_test_82cf0b7f-b476-49b0-bfbb-42f4dd0110e9
I0224 13:53:53.353524 25060 cgroups.cpp:1438] Successfullly thawed cgroup
/cgroup/freezer/mesos_test_82cf0b7f-b476-49b0-bfbb-42f4dd0110e9 after 1.264128ms
[ OK ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery (4261 ms)
{noformat}
The bad run (the segfault is related to the {{--gtest_break_on_failure}} flag):
{noformat}
...
I0224 13:53:56.421175 25057 slave.cpp:1891] Asked to kill task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000
Received killTask
Shutting down
Sending SIGTERM to process tree at pid 31706
Sent SIGTERM to the following process trees:
[
-+- 31706 sh -c while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done
\--- 31708 dd count=512 bs=1M if=/dev/zero of=./temp
]
Command terminated with signal Terminated (pid: 31706)
I0224 13:53:56.576330 25064 slave.cpp:3002] Handling status update TASK_KILLED
(UUID: 70055746-96ac-427d-9a40-df962a06ad51) for task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000 from executor(1)@127.0.0.1:52634
I0224 13:53:56.649286 25061 mem.cpp:353] Updated 'memory.soft_limit_in_bytes'
to 32MB for container 664b1778-5e51-432d-8e66-a6f275dc6d80
I0224 13:53:57.599311 25063 slave.cpp:3528] executor(1)@127.0.0.1:52634 exited
I0224 13:53:57.684360 25059 containerizer.cpp:1594] Executor for container
'664b1778-5e51-432d-8e66-a6f275dc6d80' has exited
I0224 13:53:57.688079 25059 containerizer.cpp:1378] Destroying container
'664b1778-5e51-432d-8e66-a6f275dc6d80'
I0224 13:53:57.704903 25057 cgroups.cpp:2427] Freezing cgroup
/cgroup/freezer/mesos_test_8f53be60-0c43-42da-9210-2d9ec670cd8b/664b1778-5e51-432d-8e66-a6f275dc6d80
I0224 13:53:57.715003 25057 cgroups.cpp:1409] Successfully froze cgroup
/cgroup/freezer/mesos_test_8f53be60-0c43-42da-9210-2d9ec670cd8b/664b1778-5e51-432d-8e66-a6f275dc6d80
after 9.914112ms
I0224 13:53:57.745836 25057 cgroups.cpp:2445] Thawing cgroup
/cgroup/freezer/mesos_test_8f53be60-0c43-42da-9210-2d9ec670cd8b/664b1778-5e51-432d-8e66-a6f275dc6d80
I0224 13:53:57.778103 25058 cgroups.cpp:1438] Successfullly thawed cgroup
/cgroup/freezer/mesos_test_8f53be60-0c43-42da-9210-2d9ec670cd8b/664b1778-5e51-432d-8e66-a6f275dc6d80
after 29.500928ms
I0224 13:53:57.841120 25061 mem.cpp:388] Updated 'memory.limit_in_bytes' to
32MB for container 664b1778-5e51-432d-8e66-a6f275dc6d80
I0224 13:53:57.852152 25062 status_update_manager.cpp:320] Received status
update TASK_KILLED (UUID: 70055746-96ac-427d-9a40-df962a06ad51) for task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000
I0224 13:53:57.856850 25062 status_update_manager.cpp:824] Checkpointing UPDATE
for status update TASK_KILLED (UUID: 70055746-96ac-427d-9a40-df962a06ad51) for
task 3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000
I0224 13:53:57.960736 25058 provisioner.cpp:306] Ignoring destroy request for
unknown container 664b1778-5e51-432d-8e66-a6f275dc6d80
I0224 13:53:57.967319 25058 slave.cpp:3886] Executor
'3dd870d0-aa26-47fc-b647-dcf95ef87e06' of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000 exited with status 0
I0224 13:53:57.996193 25058 slave.cpp:3400] Forwarding the update TASK_KILLED
(UUID: 70055746-96ac-427d-9a40-df962a06ad51) for task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000 to [email protected]:57058
I0224 13:53:57.997020 25058 slave.cpp:3310] Sending acknowledgement for status
update TASK_KILLED (UUID: 70055746-96ac-427d-9a40-df962a06ad51) for task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000 to executor(1)@127.0.0.1:52634
I0224 13:53:57.997710 25059 master.cpp:4794] Status update TASK_KILLED (UUID:
70055746-96ac-427d-9a40-df962a06ad51) for task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000 from slave
21daeed7-a99d-4c93-bc94-956e8e381ab9-S0 at slave(280)@127.0.0.1:57058
(localhost)
I0224 13:53:57.997799 25059 master.cpp:4842] Forwarding status update
TASK_KILLED (UUID: 70055746-96ac-427d-9a40-df962a06ad51) for task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000
I0224 13:53:57.998181 25059 master.cpp:6450] Updating the state of task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000 (latest state: TASK_KILLED, status
update state: TASK_KILLED)
E0224 13:53:57.999061 25058 process.cpp:1963] Failed to shutdown socket with fd
392: Transport endpoint is not connected
I0224 13:53:57.999202 25064 master.cpp:3952] Processing ACKNOWLEDGE call
70055746-96ac-427d-9a40-df962a06ad51 for task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000 (default) at
[email protected]:57058 on slave
21daeed7-a99d-4c93-bc94-956e8e381ab9-S0
I0224 13:53:57.999512 25064 master.cpp:6516] Removing task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 with resources cpus(*):1; mem(*):256;
disk(*):1024 of framework 21daeed7-a99d-4c93-bc94-956e8e381ab9-0000 on slave
21daeed7-a99d-4c93-bc94-956e8e381ab9-S0 at slave(280)@127.0.0.1:57058
(localhost)
../../src/tests/containerizer/memory_pressure_tests.cpp:322: Failure
(usage).failure(): Unknown container: 664b1778-5e51-432d-8e66-a6f275dc6d80
*** Aborted at 1456350838 (unix time) try "date -d @1456350838" if you are
using GNU date ***
I0224 13:53:58.006049 25060 status_update_manager.cpp:392] Received status
update acknowledgement (UUID: 70055746-96ac-427d-9a40-df962a06ad51) for task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000
PC: @ 0x1675fa0 testing::UnitTest::AddTestPartResult()
I0224 13:53:58.010462 25060 status_update_manager.cpp:824] Checkpointing ACK
for status update TASK_KILLED (UUID: 70055746-96ac-427d-9a40-df962a06ad51) for
task 3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000
*** SIGSEGV (@0x0) received by PID 25056 (TID 0x7faace5ed840) from PID 0; stack
trace: ***
@ 0x7faac6780790 (unknown)
@ 0x1675fa0 testing::UnitTest::AddTestPartResult()
@ 0x166a9d9 testing::internal::AssertHelper::operator=()
I0224 13:53:58.042770 25057 slave.cpp:3990] Cleaning up executor
'3dd870d0-aa26-47fc-b647-dcf95ef87e06' of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000 at executor(1)@127.0.0.1:52634
I0224 13:53:58.051208 25057 slave.cpp:4078] Cleaning up framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000
I0224 13:53:58.051672 25057 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_uw6mf9/slaves/21daeed7-a99d-4c93-bc94-956e8e381ab9-S0/frameworks/21daeed7-a99d-4c93-bc94-956e8e381ab9-0000/executors/3dd870d0-aa26-47fc-b647-dcf95ef87e06/runs/664b1778-5e51-432d-8e66-a6f275dc6d80'
for gc 6.99999942607704days in the future
I0224 13:53:58.051944 25060 status_update_manager.cpp:282] Closing status
update streams for framework 21daeed7-a99d-4c93-bc94-956e8e381ab9-0000
I0224 13:53:58.052243 25057 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_uw6mf9/slaves/21daeed7-a99d-4c93-bc94-956e8e381ab9-S0/frameworks/21daeed7-a99d-4c93-bc94-956e8e381ab9-0000/executors/3dd870d0-aa26-47fc-b647-dcf95ef87e06'
for gc 6.99999942398222days in the future
I0224 13:53:58.052500 25057 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_uw6mf9/meta/slaves/21daeed7-a99d-4c93-bc94-956e8e381ab9-S0/frameworks/21daeed7-a99d-4c93-bc94-956e8e381ab9-0000/executors/3dd870d0-aa26-47fc-b647-dcf95ef87e06/runs/664b1778-5e51-432d-8e66-a6f275dc6d80'
for gc 6.99999942333333days in the future
I0224 13:53:58.056956 25057 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_uw6mf9/meta/slaves/21daeed7-a99d-4c93-bc94-956e8e381ab9-S0/frameworks/21daeed7-a99d-4c93-bc94-956e8e381ab9-0000/executors/3dd870d0-aa26-47fc-b647-dcf95ef87e06'
for gc 6.99999940888889days in the future
I0224 13:53:58.057041 25057 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_uw6mf9/slaves/21daeed7-a99d-4c93-bc94-956e8e381ab9-S0/frameworks/21daeed7-a99d-4c93-bc94-956e8e381ab9-0000'
for gc 6.99999940421333days in the future
I0224 13:53:58.057101 25057 gc.cpp:54] Scheduling
'/tmp/MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_uw6mf9/meta/slaves/21daeed7-a99d-4c93-bc94-956e8e381ab9-S0/frameworks/21daeed7-a99d-4c93-bc94-956e8e381ab9-0000'
for gc 6.99999940354074days in the future
@ 0x164b847
mesos::internal::tests::MemoryPressureMesosTest_CGROUPS_ROOT_SlaveRecovery_Test::TestBody()
@ 0x1693a00
testing::internal::HandleSehExceptionsInMethodIfSupported<>()
@ 0x168ea2e
testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x166fd79 testing::Test::Run()
@ 0x1670507 testing::TestInfo::Run()
@ 0x1670b42 testing::TestCase::Run()
@ 0x1677491 testing::internal::UnitTestImpl::RunAllTests()
@ 0x169468f
testing::internal::HandleSehExceptionsInMethodIfSupported<>()
@ 0x168f5ba
testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x16761c1 testing::UnitTest::Run()
@ 0xe1da5e RUN_ALL_TESTS()
@ 0xe1d674 main
@ 0x7faac55e8d5d __libc_start_main
@ 0x9a56b9 (unknown)
/var/tmp/sclnbI5N2: line 8: 25056 Segmentation fault './.libs/mesos-tests'
'--gtest_filter=MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery'
'--gtest_repeat=1000' '--gtest_break_on_failure'
{noformat}
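To compare the two runs without reading them line by line, here is a rough sketch of a script that reports where the {{executor(1)@... exited}} line falls relative to another event. It is not part of the Mesos test suite. The helper names ({{records}}, {{first_time}}, {{exit_precedes}}) and the choice of the status update manager's "Received status update TASK_KILLED" record as the reference point are assumptions made for illustration. The excerpts are copied from the logs above.

```python
import re
from datetime import datetime

# Prefix of a glog record as it appears in the logs above:
# severity + MMDD, time with microseconds, thread id, "file:line]".
GLOG = re.compile(
    r"^[IWEF]\d{4} (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ "
    r"(?P<src>[\w.]+:\d+)\] (?P<msg>.*)$"
)


def records(log_text):
    """Yield (timestamp, source, message), folding wrapped lines into one record."""
    current = None
    for line in log_text.splitlines():
        m = GLOG.match(line)
        if m:
            if current:
                yield tuple(current)
            ts = datetime.strptime(m["time"], "%H:%M:%S.%f")
            current = [ts, m["src"], m["msg"]]
        elif current:
            # Continuation of a record that was wrapped by the mail client.
            current[2] += " " + line.strip()
    if current:
        yield tuple(current)


def first_time(log_text, pattern):
    """Timestamp of the first record whose message matches pattern, or None."""
    rx = re.compile(pattern)
    for ts, _src, msg in records(log_text):
        if rx.search(msg):
            return ts
    return None


def exit_precedes(log_text, marker):
    """True if the executor exit is logged before marker; None if either is missing."""
    exited = first_time(log_text, r"^executor\(\d+\)@\S+ exited")
    marked = first_time(log_text, marker)
    if exited is None or marked is None:
        return None
    return exited < marked


# Assumed reference event (hypothetical choice, see the note above).
MARKER = r"Received status update TASK_KILLED"

GOOD_EXCERPT = """\
I0224 13:53:53.117938 25059 status_update_manager.cpp:320] Received status
update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task
21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework
92632338-e777-41c7-a9a3-39dc62fdea4c-0000
I0224 13:53:53.169703 25060 slave.cpp:3528] executor(1)@127.0.0.1:38732 exited
"""

BAD_EXCERPT = """\
I0224 13:53:57.599311 25063 slave.cpp:3528] executor(1)@127.0.0.1:52634 exited
I0224 13:53:57.852152 25062 status_update_manager.cpp:320] Received status
update TASK_KILLED (UUID: 70055746-96ac-427d-9a40-df962a06ad51) for task
3dd870d0-aa26-47fc-b647-dcf95ef87e06 of framework
21daeed7-a99d-4c93-bc94-956e8e381ab9-0000
"""

print(exit_precedes(GOOD_EXCERPT, MARKER))  # False
print(exit_precedes(BAD_EXCERPT, MARKER))   # True
```

On these excerpts the executor exit is logged after the reference record in the good run and before it in the bad run. That ordering is only a symptom visible in the logs, not proof of the root cause.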
> MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky
> -----------------------------------------------------------
>
> Key: MESOS-4047
> URL: https://issues.apache.org/jira/browse/MESOS-4047
> Project: Mesos
> Issue Type: Bug
> Components: test
> Affects Versions: 0.26.0
> Environment: Ubuntu 14, gcc 4.8.4
> Reporter: Joseph Wu
> Assignee: Alexander Rojas
> Labels: flaky, flaky-test
> Fix For: 0.28.0
>
>
> {code:title=Output from passed test}
> [----------] 1 test from MemoryPressureMesosTest
> 1+0 records in
> 1+0 records out
> 1048576 bytes (1.0 MB) copied, 0.000430889 s, 2.4 GB/s
> [ RUN ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> I1202 11:09:14.319327 5062 exec.cpp:134] Version: 0.27.0
> I1202 11:09:14.333317 5079 exec.cpp:208] Executor registered on slave
> bea15b35-9aa1-4b57-96fb-29b5f70638ac-S0
> Registered executor on ubuntu
> Starting task 4e62294c-cfcf-4a13-b699-c6a4b7ac5162
> sh -c 'while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done'
> Forked command at 5085
> I1202 11:09:14.391739 5077 exec.cpp:254] Received reconnect request from
> slave bea15b35-9aa1-4b57-96fb-29b5f70638ac-S0
> I1202 11:09:14.398598 5082 exec.cpp:231] Executor re-registered on slave
> bea15b35-9aa1-4b57-96fb-29b5f70638ac-S0
> Re-registered executor on ubuntu
> Shutting down
> Sending SIGTERM to process tree at pid 5085
> Killing the following process trees:
> [
> -+- 5085 sh -c while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done
> \--- 5086 dd count=512 bs=1M if=/dev/zero of=./temp
> ]
> [ OK ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery (1096 ms)
> {code}
> {code:title=Output from failed test}
> [----------] 1 test from MemoryPressureMesosTest
> 1+0 records in
> 1+0 records out
> 1048576 bytes (1.0 MB) copied, 0.000404489 s, 2.6 GB/s
> [ RUN ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> I1202 11:09:15.509950 5109 exec.cpp:134] Version: 0.27.0
> I1202 11:09:15.568183 5123 exec.cpp:208] Executor registered on slave
> 88734acc-718e-45b0-95b9-d8f07cea8a9e-S0
> Registered executor on ubuntu
> Starting task 14b6bab9-9f60-4130-bdc4-44efba262bc6
> Forked command at 5132
> sh -c 'while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done'
> I1202 11:09:15.665498 5129 exec.cpp:254] Received reconnect request from
> slave 88734acc-718e-45b0-95b9-d8f07cea8a9e-S0
> I1202 11:09:15.670995 5123 exec.cpp:381] Executor asked to shutdown
> Shutting down
> Sending SIGTERM to process tree at pid 5132
> ../../src/tests/containerizer/memory_pressure_tests.cpp:283: Failure
> (usage).failure(): Unknown container: ebe90e15-72fa-4519-837b-62f43052c913
> *** Aborted at 1449083355 (unix time) try "date -d @1449083355" if you are
> using GNU date ***
> {code}
> Notice that in the failed test, the executor is asked to shut down when it
> tries to reconnect to the agent.