> On Feb. 8, 2016, 10:47 p.m., Ian Downes wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/perf_event.cpp, lines
> > 132-134
> > <https://reviews.apache.org/r/43284/diff/1/?file=1237013#file1237013line132>
> >
> > I think you should discard the future and let it do the correct thing
> > -- kill and reap the perf process -- rather than do a blocking await. The
> > perf sample interval is configurable and typical values are tens of seconds.
> >
> > Note: I'm not sure if the correct behavior is implemented...
>
> Ian Downes wrote:
> Actually, it looks like it might:
>
> ```
> virtual void initialize()
> {
>   // Stop when no one cares.
>   promise.future().onDiscard(lambda::bind(
>       static_cast<void(*)(const UPID&, bool)>(terminate), self(),
>       true));
>
>   execute();
> }
>
> virtual void finalize()
> {
>   // Kill the perf process (if it's still running) by sending
>   // SIGTERM to the signal handler which will then SIGKILL the
>   // perf process group created by setupChild.
>   if (perf.isSome() && perf->status().isPending()) {
>     kill(perf->pid(), SIGTERM);
>   }
>
>   promise.discard();
> }
> ```
I tried discarding the future before:
```
sampleFuture.get().discard();
sampleFuture.get().await();
```
But that failed. The reason is this code in `collect()`:
```
// Stop this nonsense if nobody cares.
promise->future().onDiscard(defer(this, &CollectProcess::discarded));
```
The `discarded` callback is deferred, so for every discarded future it is
only scheduled to run after `CgroupsPerfEventIsolatorProcess` finalizes.
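
To illustrate the ordering problem, here is a toy model in plain C++. It is
not libprocess: `EventQueue`, `defer`, `drain`, and `simulateTeardown` are
hypothetical names, and the single-threaded queue only stands in for the idea
that a deferred callback is queued rather than run immediately.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an actor mailbox (hypothetical, not libprocess).
struct EventQueue {
  std::deque<std::function<void()>> pending;

  // Queue the callback instead of running it right away.
  void defer(std::function<void()> f) { pending.push_back(std::move(f)); }

  // Run every callback queued so far, in FIFO order.
  void drain() {
    while (!pending.empty()) {
      std::function<void()> f = std::move(pending.front());
      pending.pop_front();
      f();
    }
  }
};

// Returns the order in which the three steps happen in this toy model.
std::vector<std::string> simulateTeardown() {
  std::vector<std::string> log;
  EventQueue queue;

  // Discarding the future only enqueues the "discarded" callback.
  log.push_back("discard requested");
  queue.defer([&log]() { log.push_back("discarded callback runs"); });

  // The isolator finalizes before the queue gets a chance to drain.
  log.push_back("isolator finalize");

  // Only now does the deferred callback actually run.
  queue.drain();

  assert(log.size() == 3);
  return log;
}
```

In this sketch the deferred callback is always recorded last, which matches
what I observed: discarding alone does not stop the perf process before the
isolator has finalized.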
- haosdent
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43284/#review118312
-----------------------------------------------------------
On Feb. 14, 2016, 7:59 a.m., haosdent huang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43284/
> -----------------------------------------------------------
>
> (Updated Feb. 14, 2016, 7:59 a.m.)
>
>
> Review request for mesos, Ian Downes, Jan Schlicht, and Paul Brett.
>
>
> Bugs: MESOS-4655
> https://issues.apache.org/jira/browse/MESOS-4655
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Wait for perf statistics processes to exit.
>
>
> Diffs
> -----
>
> src/slave/containerizer/mesos/isolators/cgroups/perf_event.hpp
> 65e731886b9e5cac07ae3ad6398faf8f50de5650
> src/slave/containerizer/mesos/isolators/cgroups/perf_event.cpp
> 5ef4ae5c468580352cd16e7716b9ca4c0acde659
>
> Diff: https://reviews.apache.org/r/43284/diff/
>
>
> Testing
> -------
>
> Without this patch, running
> ```
> sudo GLOG_v=1 ./bin/mesos-tests.sh
> --gtest_filter="PerfEventIsolatorTest.ROOT_CGROUPS_Sample" --verbose
> ```
> fails with this error:
> ```
> [----------] Global test environment tear-down
> ../../src/tests/environment.cpp:732: Failure
> Failed
> Tests completed with child processes remaining:
> -+- 16501 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests
> --gtest_filter=PerfEventIsolatorTest.ROOT_CGROUPS_Sample --verbose
> |-+- 16580 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests
> --gtest_filter=PerfEventIsolatorTest.ROOT_CGROUPS_Sample --verbose
> | -+- 16582 perf stat --all-cpus --field-separator , --log-fd 1 --event
> cycles --cgroup mesos/239d30bb-f7a1-413b-9d99-0914149d5899 --event task-clock
> --cgroup mesos/239d30bb-f7a1-413b-9d99-0914149d5899 -- sleep 0.25
> | --- 16584 sleep 0.25
> --- 16581 ()
> [==========] 1 test from 1 test case ran. (4095 ms total)
> ```
>
> This also fixes similar errors in
> `MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward` and
> `CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf`.
>
>
> Thanks,
>
> haosdent huang
>
>