-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9145/#review16819
-----------------------------------------------------------

Ship it!


src/linux/cgroups.cpp
<https://reviews.apache.org/r/9145/#comment35732>

    Please use Error here and everywhere else (you can do it in a final review request).


src/slave/cgroups_isolation_module.cpp
<https://reviews.apache.org/r/9145/#comment35731>

    PCHECK.


- Benjamin Hindman


On Feb. 13, 2013, 9:48 p.m., Ben Mahler wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9145/
> -----------------------------------------------------------
>
> (Updated Feb. 13, 2013, 9:48 p.m.)
>
>
> Review request for mesos, Benjamin Hindman and Vinod Kone.
>
>
> Description
> -------
>
> This implements resource collection for the cgroups isolation module.
>
> From the redhat documentation:
> https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpuacct.html
>
> // cpuacct.usage
> // reports the total CPU time (in nanoseconds) consumed by all tasks in this cgroup (including tasks lower in the hierarchy).
> I don't like this control because it can be reset back to zero!
>
> // cpuacct.stat
> // reports the user and system CPU time consumed by all tasks in this cgroup (including tasks lower in the hierarchy) in the following way:
> //   user: CPU time consumed by tasks in user mode.
> //   system: CPU time consumed by tasks in system (kernel) mode.
> // CPU time is reported in the units defined by the USER_HZ variable.
> Since USER_HZ is typically 100, the granularity here is only 10 ms.
>
> // cpuacct.usage_percpu
> // reports the CPU time (in nanoseconds) consumed on each CPU by all tasks in this cgroup (including tasks lower in the hierarchy).
> I don't like this control because it can be reset back to zero!
>
> I've used cpuacct.stat since AFAICT it can't be reset to 0.
> However cpuacct.stat has somewhat low granularity, see the testing comments below.
>
>
> This addresses bug MESOS-324.
>     https://issues.apache.org/jira/browse/MESOS-324
>
>
> Diffs
> -----
>
>   src/linux/cgroups.hpp 1f701f3bbbe06ddf84768c68b529aba4659c19be
>   src/linux/cgroups.cpp 03b31e7309b9dd65f00d3b0da2abb81ddaaeea43
>   src/slave/cgroups_isolation_module.cpp 63cefc33cf34eebb82db5d8448b751be8652fa36
>   src/tests/cgroups_tests.cpp b219906374764e91f1a5268469ae92dd0fe08e53
>
> Diff: https://reviews.apache.org/r/9145/diff/
>
>
> Testing
> -------
>
> Added tests for cgroups::stat.
>
> End to end testing using the webui.
>
> NOTES for cpuacct.stat:
> $ cat /cgroup/mesos/framework_201302132039-2081170186-5050-60933-0001_executor_default_tag_3e1f5310-c873-42cb-9aa4-4ee4c2b9feb8/cpuacct.usage
> 4672471833
> --> 4672471833ns = 4.67 seconds
>
> $ cat /cgroup/mesos/framework_201302132039-2081170186-5050-60933-0001_executor_default_tag_3e1f5310-c873-42cb-9aa4-4ee4c2b9feb8/cpuacct.usage_percpu
> 831220060 463800214 319016010 184325849 840595741 441855678 294660045 160799890 240361561 197829862 130045719 56978804 227972655 193743493 98604097 70557562
> --> 831220060+463800214+319016010+184325849+840595741+441855678+294660045+160799890+240361561+197829862+130045719+56978804+227972655+193743493+98604097+70557562
>     = 4752367240ns = 4.75 seconds
>
> $ cat /cgroup/mesos/framework_201302132039-2081170186-5050-60933-0001_executor_default_tag_3e1f5310-c873-42cb-9aa4-4ee4c2b9feb8/cpuacct.stat
> user 111
> system 246
> --> 1.11 + 2.46 = 3.57 seconds
>
> So since cpuacct.stat reveals only the user + system times, we see slightly lower times than where the total time is displayed. I'm guessing they may be including other cpu times?
> E.g. steal, guest
>
> I think user + system is a good measurement.
>
>
> Thanks,
>
> Ben Mahler
>
>
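
For reference, the arithmetic in the testing notes above can be reproduced with a short sketch. It assumes USER_HZ is 100, as the notes do; on a live system the value can be read with sysconf(_SC_CLK_TCK). The helper names parseCpuacctStat, ticksToSeconds and sumPerCpuNanos are hypothetical and are not the cgroups::stat interface from this diff.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (NOT the cgroups::stat added in this review): parses
// the "key value" lines of a cpuacct.stat file into a map of tick counts.
inline std::map<std::string, uint64_t> parseCpuacctStat(
    const std::string& contents)
{
  std::map<std::string, uint64_t> result;
  std::istringstream in(contents);
  std::string key;
  uint64_t value;
  while (in >> key >> value) {
    result[key] = value;
  }
  return result;
}

// Converts USER_HZ ticks to seconds. ticksPerSecond is an assumption
// supplied by the caller (typically 100, i.e. 10 ms granularity).
inline double ticksToSeconds(uint64_t ticks, long ticksPerSecond)
{
  return static_cast<double>(ticks) / ticksPerSecond;
}

// Sums the whitespace-separated nanosecond counters found in
// cpuacct.usage_percpu.
inline uint64_t sumPerCpuNanos(const std::string& contents)
{
  std::istringstream in(contents);
  uint64_t total = 0;
  uint64_t value;
  while (in >> value) {
    total += value;
  }
  return total;
}
```

With the sample values above, "user 111" and "system 246" at 100 ticks per second give 1.11 + 2.46 = 3.57 seconds, and the sixteen per-CPU counters sum to 4752367240 ns.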
