Ian Downes created MESOS-2103:
---------------------------------
Summary: Expose number and state of tasks in a container
Key: MESOS-2103
URL: https://issues.apache.org/jira/browse/MESOS-2103
Project: Mesos
Issue Type: Improvement
Components: isolation
Affects Versions: 0.20.0
Reporter: Ian Downes
The CFS cpu statistics (cpus_nr_throttled, cpus_nr_periods,
cpus_throttled_time) are difficult to interpret.
1) nr_throttled is the number of enforcement periods in which *any* throttling
occurred.
2) throttled_time is the aggregate throttled time summed *across all runnable
tasks* (tasks in the Linux sense).
For example, over a typical 60 second sampling interval with nr_periods = 600,
nr_throttled could be 60, i.e., 10% of periods, yet throttled_time could be
much higher than (60/600) * 60 = 6 seconds if more than one task is runnable
but throttled, because *each* throttled task contributes to the total
throttled time.
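The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the reasoning in this description: the function names are hypothetical, and the "every throttled task contributes its full share" model is the simplifying assumption stated above, not a description of the kernel's accounting code.

```python
def naive_throttled_seconds(nr_throttled, nr_periods, interval_seconds):
    """Throttled time one would guess from the period counts alone."""
    if nr_periods == 0:
        return 0.0
    return nr_throttled * interval_seconds / nr_periods


def aggregate_throttled_seconds(naive_seconds, throttled_tasks):
    """Assumed model: each throttled task adds the full naive share."""
    return naive_seconds * throttled_tasks


# Numbers from the example: 60 of 600 periods throttled in a 60 second sample.
naive = naive_throttled_seconds(60, 600, 60.0)

# With 3 tasks throttled at once (a made-up count), the aggregate is 3x larger.
aggregate = aggregate_throttled_seconds(naive, 3)

print(naive, aggregate)
```

The point of the sketch is that the period counts alone bound nothing: without knowing how many tasks were runnable, the reported throttled_time cannot be related back to wall-clock time.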
Small test to demonstrate throttled_time > nr_periods * quota_interval:
5 x {{'openssl speed'}} running with quota=100ms:
{noformat}
cat cpu.stat && sleep 1 && cat cpu.stat
nr_periods 3228
nr_throttled 1276
throttled_time 528843772540
nr_periods 3238
nr_throttled 1286
throttled_time 531668964667
{noformat}
All 10 periods in the 1 second window were throttled (100%), for a total
throttled time of 2.8 seconds (531668964667 - 528843772540 ns), i.e., "more
than 100%" of the wall-clock interval.
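The deltas quoted for this test can be reproduced with a short script. This is a minimal sketch: the helper names are hypothetical, and it assumes throttled_time is reported in nanoseconds, which is consistent with the 2.8 second figure here.

```python
BEFORE = """\
nr_periods 3228
nr_throttled 1276
throttled_time 528843772540
"""

AFTER = """\
nr_periods 3238
nr_throttled 1286
throttled_time 531668964667
"""


def parse_cpu_stat(text):
    """Parse 'key value' lines from a cpu.stat snapshot into a dict of ints."""
    stats = {}
    for line in text.strip().splitlines():
        key, value = line.split()
        stats[key] = int(value)
    return stats


def throttle_delta(before, after):
    """Difference between two cpu.stat snapshots taken some time apart."""
    periods = after["nr_periods"] - before["nr_periods"]
    throttled = after["nr_throttled"] - before["nr_throttled"]
    throttled_ns = after["throttled_time"] - before["throttled_time"]
    return {
        "periods": periods,
        "throttled_periods": throttled,
        # Fraction of periods in which any throttling occurred.
        "throttled_fraction": throttled / periods if periods else 0.0,
        # Assumes throttled_time is in nanoseconds.
        "throttled_seconds": throttled_ns / 1e9,
    }


delta = throttle_delta(parse_cpu_stat(BEFORE), parse_cpu_stat(AFTER))
print(delta)
```

Comparing throttled_seconds with the wall-clock time between the two snapshots (about 1 second here) shows the aggregate exceeding the elapsed time.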
It would be helpful to expose the number and state of tasks in the container
cgroup. This would be very coarse-grained, but it would give some guidance
when interpreting the throttling statistics.