> On Feb. 8, 2016, 10:47 p.m., Ian Downes wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/perf_event.cpp, lines 132-134
> > <https://reviews.apache.org/r/43284/diff/1/?file=1237013#file1237013line132>
> >
> > I think you should discard the future and let it do the correct thing
> > -- kill and reap the perf process -- rather than do a blocking await. The
> > perf sample interval is configurable and typical values are tens of seconds.
> >
> > Note: I'm not sure if the correct behavior is implemented...
>
> Ian Downes wrote:
>     Actually, it looks like it might:
>
>     ```
>     virtual void initialize()
>     {
>       // Stop when no one cares.
>       promise.future().onDiscard(lambda::bind(
>           static_cast<void(*)(const UPID&, bool)>(terminate), self(),
>           true));
>
>       execute();
>     }
>
>     virtual void finalize()
>     {
>       // Kill the perf process (if it's still running) by sending
>       // SIGTERM to the signal handler which will then SIGKILL the
>       // perf process group created by setupChild.
>       if (perf.isSome() && perf->status().isPending()) {
>         kill(perf->pid(), SIGTERM);
>       }
>
>       promise.discard();
>     }
>     ```
>
> haosdent huang wrote:
>     I tried discarding it before:
>     ```
>     sampleFuture.get().discard();
>     sampleFuture.get().await();
>     ```
>     But that failed, because of this in `collect()`:
>
>     ```
>     // Stop this nonsense if nobody cares.
>     promise->future().onDiscard(defer(this, &CollectProcess::discarded));
>     ```
>
>     The discard of every future is deferred, so it is scheduled to run
>     after `CgroupsPerfEventIsolatorProcess` finalizes.

@idownes, let me discard this and submit a better patch for this problem. I have recorded this in [MESOS-5075](http://issues.apache.org/jira/browse/MESOS-5075).


- haosdent


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43284/#review118312
-----------------------------------------------------------


On Feb. 14, 2016, 7:59 a.m., haosdent huang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43284/
> -----------------------------------------------------------
>
> (Updated Feb. 14, 2016, 7:59 a.m.)
>
>
> Review request for mesos, Ian Downes, Jan Schlicht, and Paul Brett.
>
>
> Bugs: MESOS-4655
>     https://issues.apache.org/jira/browse/MESOS-4655
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Wait for perf statistics processes to exit.
>
>
> Diffs
> -----
>
>   src/slave/containerizer/mesos/isolators/cgroups/perf_event.hpp 65e731886b9e5cac07ae3ad6398faf8f50de5650
>   src/slave/containerizer/mesos/isolators/cgroups/perf_event.cpp 5ef4ae5c468580352cd16e7716b9ca4c0acde659
>
> Diff: https://reviews.apache.org/r/43284/diff/
>
>
> Testing
> -------
>
> Without this patch, running
> ```
> sudo GLOG_v=1 ./bin/mesos-tests.sh --gtest_filter="PerfEventIsolatorTest.ROOT_CGROUPS_Sample" --verbose
> ```
> produces this error:
> ```
> [----------] Global test environment tear-down
> ../../src/tests/environment.cpp:732: Failure
> Failed
> Tests completed with child processes remaining:
> -+- 16501 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests --gtest_filter=PerfEventIsolatorTest.ROOT_CGROUPS_Sample --verbose
>  |-+- 16580 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests --gtest_filter=PerfEventIsolatorTest.ROOT_CGROUPS_Sample --verbose
>  | -+- 16582 perf stat --all-cpus --field-separator , --log-fd 1 --event cycles --cgroup mesos/239d30bb-f7a1-413b-9d99-0914149d5899 --event task-clock --cgroup mesos/239d30bb-f7a1-413b-9d99-0914149d5899 -- sleep 0.25
>  |   --- 16584 sleep 0.25
>  --- 16581 ()
> [==========] 1 test from 1 test case ran. (4095 ms total)
> ```
>
> This also fixes similar errors in `MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward` and `CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf`.
>
>
> Thanks,
>
> haosdent huang
>
>
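
For reference, the "kill and reap" behavior discussed in this thread can be illustrated outside of Mesos with plain POSIX calls. The following is only a minimal sketch under stated assumptions, not the code in perf_event.cpp or libprocess: `spawnIdleChild` and `killAndReap` are hypothetical names, the idle child merely stands in for a long-running `perf stat` sampler, and SIGKILL is sent directly to the process group instead of going through a SIGTERM handler.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that idles in its own process
// group, standing in for a long-running `perf stat` sampler.
pid_t spawnIdleChild()
{
  pid_t pid = fork();
  if (pid == 0) {
    setpgid(0, 0);         // Child: become leader of a new process group.
    for (;;) { pause(); }  // Idle until a signal terminates the process.
  }
  if (pid > 0) {
    setpgid(pid, pid);     // Parent: same call, closes the startup race.
  }
  return pid;              // -1 if fork() failed.
}

// Hypothetical helper: kills the child's process group, then reaps the
// group leader so that no child process remains afterwards.
// Returns the terminating signal, 0 on a normal exit, or -1 on error.
int killAndReap(pid_t pid)
{
  assert(pid > 0);  // killpg(0, ...) would signal our own process group.

  if (killpg(pid, SIGKILL) != 0) {
    return -1;
  }

  int status = 0;
  while (waitpid(pid, &status, 0) == -1) {
    if (errno != EINTR) {
      return -1;  // Anything other than an interrupted wait is an error.
    }
  }

  return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling `killAndReap(spawnIdleChild())` terminates the child and blocks only until the kernel reports its exit, rather than for a full sample interval. Note that `waitpid` reaps only the direct child; other members of the group are signalled but are reaped by whichever process is their parent.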
