[jira] [Comment Edited] (MESOS-7506) Multiple tests leave orphan containers.

2018-01-23 Thread Andrei Budnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336272#comment-16336272
 ] 

Andrei Budnik edited comment on MESOS-7506 at 1/23/18 7:20 PM:
---

While recovery is in progress for [the first slave|https://github.com/apache/mesos/blob/ce0905fcb31a10ade0962a89235fa90b01edf01a/src/tests/containerizer/linux_capabilities_isolator_tests.cpp#L733], calling [`StartSlave()`|https://github.com/apache/mesos/blob/ce0905fcb31a10ade0962a89235fa90b01edf01a/src/tests/containerizer/linux_capabilities_isolator_tests.cpp#L738] for the second agent leads to [slave::Containerizer::create()|https://github.com/apache/mesos/blob/ce0905fcb31a10ade0962a89235fa90b01edf01a/src/tests/cluster.cpp#L431] being called to create a containerizer. Creating a Mesos containerizer in turn calls [`cgroups::prepare`|https://github.com/apache/mesos/blob/ce0905fcb31a10ade0962a89235fa90b01edf01a/src/slave/containerizer/mesos/linux_launcher.cpp#L124], which eventually creates a ["test" container|https://github.com/apache/mesos/blob/ce0905fcb31a10ade0962a89235fa90b01edf01a/src/linux/cgroups.cpp#L476]. As a result, the recovery process of the first slave [might detect|https://github.com/apache/mesos/blob/ce0905fcb31a10ade0962a89235fa90b01edf01a/src/slave/containerizer/mesos/linux_launcher.cpp#L268-L301] this "test" container as an orphaned container.

So there is a race between the recovery process of the first slave and the attempt to create a containerizer for the second agent.
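
A rough sketch of the ordering that triggers this race (simplified pseudocode rather than the actual test; {{detector}}, {{flags}} and {{flags2}} stand in for whatever the test fixture uses):
{code}
// Simplified sketch of the racy ordering (not verbatim test code): the second
// StartSlave() runs its containerizer setup while the first agent's recovery
// is still scanning cgroups under the same root.
Try<Owned<cluster::Slave>> slave1 = StartSlave(detector.get(), flags);
ASSERT_SOME(slave1);

// slave1's recovery is still in flight at this point...

// Starting the second agent creates a new Mesos containerizer, which calls
// cgroups::prepare() and briefly creates a "test" cgroup. If slave1's recovery
// enumerates cgroups right now, it may classify that transient "test" cgroup
// as an orphan.
Try<Owned<cluster::Slave>> slave2 = StartSlave(detector.get(), flags2);
ASSERT_SOME(slave2);
{code}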



> Multiple tests leave orphan containers.
> ---
>
> Key: MESOS-7506
> URL: https://issues.apache.org/jira/browse/MESOS-7506
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
> Environment: Ubuntu 16.04
> Fedora 23
> other Linux distros
>Reporter: Alexander Rukletsov
>Assignee: Andrei Budnik
>Priority: Major
>  Labels: containerizer, flaky-test, mesosphere
> Attachments: KillMultipleTasks-badrun.txt, 
> ROOT_IsolatorFlags-badrun.txt, ROOT_IsolatorFlags-badrun2.txt, 
> ROOT_IsolatorFlags-badrun3.txt, ReconcileTasksMissingFromSlave-badrun.txt, 
> ResourceLimitation-badrun.txt, ResourceLimitation-badrun2.txt, 
> RestartSlaveRequireExecutorAuthentication-badrun.txt, 
> TaskWithFileURI-badrun.txt
>
>
> I've observed a number of flaky tests that leave orphan containers upon 
> cleanup. A typical log looks like this:
> {noformat}
> ../../src/tests/cluster.cpp:580: Failure
> Value of: containers->empty()
>   Actual: false
> Expected: true
> Failed to destroy containers: { da3e8aa8-98e7-4e72-a8fd-5d0bae960014 }
> {noformat}
> All currently affected tests:
> {noformat}
> SlaveTest.RestartSlaveRequireExecutorAuthentication // cannot reproduce any 
> more
> ROOT_IsolatorFlags
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (MESOS-7506) Multiple tests leave orphan containers.

2017-11-08 Thread Andrei Budnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225515#comment-16225515
 ] 

Andrei Budnik edited comment on MESOS-7506 at 11/8/17 11:01 AM:


*First cause*

Some tests (from {{SlaveTest}} and {{SlaveRecoveryTest}}) follow a pattern [like this|https://github.com/apache/mesos/blob/ff01d0c44251e2ffaa2f4f47b33c790594d194d9/src/tests/slave_tests.cpp#L393-L406], where the clock is advanced by {{executor_registration_timeout}} and the test then waits in a loop until a task status update is sent. This loop runs while the container is being destroyed. Container destruction itself consists of multiple steps, one of which waits for [cgroups destruction|https://github.com/apache/mesos/blob/ff01d0c44251e2ffaa2f4f47b33c790594d194d9/src/slave/containerizer/mesos/linux_launcher.cpp#L567]. This means there is a race between the container destruction process and the loop that advances the clock, with two possible outcomes:
# The container is completely destroyed before the advancing clock reaches a timeout (e.g. {{cgroups::DESTROY_TIMEOUT}}).
# A timeout is triggered by the advancing clock before container destruction completes. This results in [leaving orphaned|https://github.com/apache/mesos/blob/ff01d0c44251e2ffaa2f4f47b33c790594d194d9/src/slave/containerizer/mesos/containerizer.cpp#L2367-L2380] containers, which are then detected by the [Slave destructor|https://github.com/apache/mesos/blob/ff01d0c44251e2ffaa2f4f47b33c790594d194d9/src/tests/cluster.cpp#L559-L584] in {{tests/cluster.cpp}}, so the test fails.

The issue is easily reproduced by advancing the clock by 60 seconds or more in the loop that waits for the status update (see the sketch below).
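
A simplified sketch of the racy test pattern (identifiers such as {{flags}}, {{status}} and {{MAX_REAP_INTERVAL()}} approximate the surrounding test fixture and are not an exact copy of the test):
{code}
// The clock is advanced in a loop while the containerizer is still destroying
// the container, so each advance can also fire timers inside the in-flight
// cgroups destruction (e.g. the freezer destroy timeout), leaving an orphaned
// container behind.
Clock::pause();

// Trigger the executor registration timeout; the agent starts destroying
// the container.
Clock::advance(flags.executor_registration_timeout);

// Keep advancing the clock until the expected status update arrives.
while (status.isPending()) {
  Clock::advance(process::MAX_REAP_INTERVAL());
  Clock::settle();
}

AWAIT_READY(status);
Clock::resume();
{code}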



> Multiple tests leave orphan containers.
> ---
>
> Key: MESOS-7506
> URL: https://issues.apache.org/jira/browse/MESOS-7506
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
> Environment: Ubuntu 16.04
> Fedora 23
> other Linux distros
>Reporter: Alexander Rukletsov
>Assignee: Andrei Budnik
>  Labels: containerizer, flaky-test, mesosphere
> Attachments: KillMultipleTasks-badrun.txt, 
> ResourceLimitation-badrun.txt, TaskWithFileURI-badrun.txt
>
>
> I've observed a number of flaky tests that leave orphan containers upon 
> cleanup. A typical log looks like this:
> {noformat}
> ../../src/tests/cluster.cpp:580: Failure
> Value of: containers->empty()
>   Actual: false
> Expected: true
> Failed to destroy containers: { da3e8aa8-98e7-4e72-a8fd-5d0bae960014 }
> {noformat}
> All currently affected tests:
> {noformat}
> ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.KillTask/0
> ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.TaskWithFileURI/0
> ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.ResourceLimitation/0
> ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.KillMultipleTasks/0
> SlaveRecoveryTest/0.RecoverUnregisteredExecutor
> SlaveRecoveryTest/0.CleanupExecutor
> SlaveRecoveryTest/0.RecoverTerminatedExecutor
> SlaveTest.ShutdownUnregisteredExecutor
> ShutdownUnregisteredExecutor
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (MESOS-7506) Multiple tests leave orphan containers.

2017-10-20 Thread Andrei Budnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212605#comment-16212605
 ] 

Andrei Budnik edited comment on MESOS-7506 at 10/20/17 6:30 PM:


The bug has been reproduced with extra debug logs (SlaveTest.ShutdownUnregisteredExecutor):
{code}
I1020 17:59:05.049862 16817 cgroups.cpp:1563] TasksKiller::freeze: 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7
I1020 17:59:05.049876 16817 cgroups.cpp:3085] Freezing cgroup 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7
I1020 17:59:05.050351 16817 cgroups.cpp:1398] Freezer::freeze: 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7
I1020 17:59:05.051440 16817 cgroups.cpp:1423] Freezer::freeze 2: 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7: FREEZING
I1020 17:59:05.051749 16819 cgroups.cpp:1398] Freezer::freeze: 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7
I1020 17:59:05.052760 16819 cgroups.cpp:1416] Successfully froze cgroup 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7 after 1secs
I1020 17:59:05.052858 16819 hierarchical.cpp:1488] Performed allocation for 1 
agents in 15715ns
I1020 17:59:05.052901 16819 cgroups.cpp:1574] TasksKiller::kill: 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7
I1020 17:59:05.053357 16819 cgroups.cpp:1584] TasksKiller::kill: reap: 31229
I1020 17:59:05.053377 16819 cgroups.cpp:1584] TasksKiller::kill: reap: 31243
I1020 17:59:05.054193 16819 cgroups.cpp:928] cgroups::kill: 31229
I1020 17:59:05.054206 16819 cgroups.cpp:928] cgroups::kill: 31243
I1020 17:59:05.054262 16819 cgroups.cpp:1598] TasksKiller::thaw: 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7
I1020 17:59:05.054272 16819 cgroups.cpp:3103] Thawing cgroup 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7
I1020 17:59:05.054757 16819 cgroups.cpp:1432] Freezer::thaw: 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7
I1020 17:59:05.057647 16819 cgroups.cpp:1449] Successfully thawed cgroup 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7 after 0ns
I1020 17:59:05.057842 16816 cgroups.cpp:1604] TasksKiller::reap: 
/sys/fs/cgroup/freezer/mesos/57152291-a86c-446d-8bd3-eb5c60dfefd7
{code}
{{TasksKiller::reap}} was called, but {{TasksKiller::finished}} was never called.
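
For context, a rough outline of the {{TasksKiller}} continuation chain in {{src/linux/cgroups.cpp}} (reconstructed from memory, so names and structure are approximate). It shows why reaching {{reap}} without {{finished}} means the destruction chain never completed:
{code}
// Approximate outline, not the exact source: freeze the cgroup, kill all
// tasks, thaw so they can exit, wait for them to be reaped, and only then
// invoke finished(). The log above shows reap() starting but finished() never
// running, i.e. the chain stalls while waiting on the reaped processes.
Future<Nothing> TasksKiller::killTasks()
{
  Future<Nothing> chain = freeze()          // Freeze the cgroup.
    .then(defer(self(), &Self::kill))       // SIGKILL every task in the cgroup.
    .then(defer(self(), &Self::thaw))       // Thaw so the killed tasks can exit.
    .then(defer(self(), &Self::reap));      // Wait until all tasks are reaped.

  chain.onAny(defer(self(), &Self::finished, lambda::_1));

  return chain;
}
{code}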


[jira] [Comment Edited] (MESOS-7506) Multiple tests leave orphan containers.

2017-10-20 Thread Andrei Budnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212605#comment-16212605
 ] 

Andrei Budnik edited comment on MESOS-7506 at 10/20/17 6:28 PM:


The bug has been reproduced with extra debug logs (SlaveTest.ShutdownUnregisteredExecutor):
{code}
I1020 12:07:20.266032  9274 containerizer.cpp:2220] Destroying container 
7f9cb5a6-26c9-4010-ace9-b9cb3e065542 in RUNNING state
I1020 12:07:20.266042  9274 containerizer.cpp:2784] Transitioning the state of 
container 7f9cb5a6-26c9-4010-ace9-b9cb3e065542 from RUNNING to DESTROYING
I1020 12:07:20.266175  9274 linux_launcher.cpp:514] Asked to destroy container 
7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.266717  9274 linux_launcher.cpp:560] Using freezer to destroy 
cgroup mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.268649  9274 cgroups.cpp:1562] TasksKiller::freeze: 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.268756  9274 cgroups.cpp:3083] Freezing cgroup 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.269533  9276 cgroups.cpp:1397] Freezer::freeze: 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.270486  9276 cgroups.cpp:1422] Freezer::freeze 2: 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542: FREEZING
I1020 12:07:20.270725  9272 cgroups.cpp:1397] Freezer::freeze: 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.271625  9272 cgroups.cpp:1415] Successfully froze cgroup 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542 after 1secs
I1020 12:07:20.271724  9272 hierarchical.cpp:1488] Performed allocation for 1 
agents in 18541ns
I1020 12:07:20.271767  9272 cgroups.cpp:1573] TasksKiller::kill: 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.273386  9272 cgroups.cpp:1596] TasksKiller::thaw: 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.273486  9272 cgroups.cpp:3101] Thawing cgroup 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.274129  9272 cgroups.cpp:1431] Freezer::thaw: 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.276964  9272 cgroups.cpp:1448] Successfully thawed cgroup 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542 after 0ns
I1020 12:07:20.277225  9277 cgroups.cpp:1602] TasksKiller::reap: 
/sys/fs/cgroup/freezer/mesos/7f9cb5a6-26c9-4010-ace9-b9cb3e065542
I1020 12:07:20.277613  9279 hierarchical.cpp:1488] Performed allocation for 1 
agents in 17680ns
I1020 12:07:20.22  9279 containerizer.cpp:2671] Container 
7f9cb5a6-26c9-4010-ace9-b9cb3e065542 has exited
{code}
{{TasksKiller::reap}} was called, but {{TasksKiller::finished}} was never called.

