[jira] [Updated] (MESOS-2103) Expose number of processes and threads in a container

Chi Zhang (JIRA) Wed, 18 Feb 2015 12:17:11 -0800

     [ https://issues.apache.org/jira/browse/MESOS-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Chi Zhang updated MESOS-2103:
-----------------------------
    Description: 
The CFS CPU statistics (cpus_nr_throttled, cpus_nr_periods, 
cpus_throttled_time) are difficult to interpret:
1) nr_throttled is the number of intervals in which *any* throttling occurred.
2) throttled_time is the aggregate time *across all runnable tasks* (tasks in 
the Linux sense).

For example, in a typical 60-second sampling interval with nr_periods = 600, 
nr_throttled could be 60 (10% of intervals), yet throttled_time could be much 
higher than (60/600) * 60 = 6 seconds if more than one task is runnable but 
throttled, because *each* throttled task contributes to the total throttled 
time.

Small test to demonstrate throttled_time > nr_periods * quota_interval:

5 x {{'openssl speed'}} running with quota=100ms:
{noformat}
cat cpu.stat && sleep 1 && cat cpu.stat
nr_periods 3228
nr_throttled 1276
throttled_time 528843772540
nr_periods 3238
nr_throttled 1286
throttled_time 531668964667
{noformat}
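Between the two samples nr_periods and nr_throttled both rose by 10, and 
throttled_time rose by 2825192127 ns. All 10 intervals were throttled (100%), 
for a total throttled time of about 2.8 seconds within a 1 second window 
("more than 100%" of the time interval).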
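
As a rough illustration of the arithmetic above, the following Python sketch 
parses the two cpu.stat samples from the test and computes the deltas. The 
function names (parse_cpu_stat, throttle_delta) and the wall_seconds parameter 
are hypothetical and exist only for this example; the 1 second window is taken 
from the `sleep 1` in the test, not measured.

```python
def parse_cpu_stat(text):
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttle_delta(before, after, wall_seconds):
    """Compare two cpu.stat samples taken wall_seconds apart (assumed, not measured)."""
    periods = after["nr_periods"] - before["nr_periods"]
    throttled = after["nr_throttled"] - before["nr_throttled"]
    # throttled_time is reported in nanoseconds.
    throttled_seconds = (after["throttled_time"] - before["throttled_time"]) / 1e9
    return {
        "periods": periods,
        "throttled_periods": throttled,
        "throttled_fraction": throttled / periods if periods else 0.0,
        "throttled_seconds": throttled_seconds,
        # Can exceed 1.0 because throttled_time is an aggregate.
        "throttled_ratio": throttled_seconds / wall_seconds,
    }


# The two samples from the test in this issue.
BEFORE = """nr_periods 3228
nr_throttled 1276
throttled_time 528843772540
"""
AFTER = """nr_periods 3238
nr_throttled 1286
throttled_time 531668964667
"""

delta = throttle_delta(parse_cpu_stat(BEFORE), parse_cpu_stat(AFTER), wall_seconds=1.0)
print(delta)
```

The throttled_fraction comes out at 1.0 (every period throttled) while 
throttled_ratio is about 2.8, which is the "more than 100%" effect described 
above.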


It would be helpful to expose the number of processes and tasks (threads) in 
the container cgroup. This would be at a very coarse granularity but would 
give some guidance when interpreting the statistics above.
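
One possible way to obtain these counts is sketched below in Python. It is an 
illustration only, not the Mesos implementation: it assumes a cgroup v1 
hierarchy in which the container's cgroup directory contains a `cgroup.procs` 
file (one process ID per line) and a `tasks` file (one thread ID per line). 
The function names are hypothetical.

```python
import os


def count_ids(path):
    """Count the non-empty lines (one ID per line) in a cgroup control file."""
    with open(path) as f:
        return sum(1 for line in f if line.strip())


def cgroup_counts(cgroup_dir):
    """Return process and thread counts for one cgroup directory.

    Assumes the cgroup v1 layout: 'cgroup.procs' lists process IDs and
    'tasks' lists thread IDs (tasks in the Linux sense).
    """
    return {
        "processes": count_ids(os.path.join(cgroup_dir, "cgroup.procs")),
        "threads": count_ids(os.path.join(cgroup_dir, "tasks")),
    }
```

Calling cgroup_counts() with the container's cgroup directory returns a single 
snapshot. The counts can change between reads, so they are only the coarse 
guidance mentioned above.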

  was:
The CFS cpu statistics (cpus_nr_throttled, cpus_nr_periods, 
cpus_throttled_time) are difficult to interpret.
1) nr_throttled is the number of intervals where *any* throttling occurred
2) throttled_time is the aggregate time *across all runnable tasks* (tasks in 
the Linux sense).

For example, in a typical 60 second sampling interval: nr_periods = 600, 
nr_throttled could be 60, i.e., 10% of intervals, but throttled_time could be 
much higher than (60/600) * 60 = 6 seconds if there is more than one task that 
is runnable but throttled. *Each* throttled task contributes to the total 
throttled time.

Small test to demonstrate throttled_time > nr_periods * quota_interval:

5 x {{'openssl speed'}} running with quota=100ms:
{noformat}
cat cpu.stat && sleep 1 && cat cpu.stat
nr_periods 3228
nr_throttled 1276
throttled_time 528843772540
nr_periods 3238
nr_throttled 1286
throttled_time 531668964667
{noformat}
All 10 intervals throttled (100%) for total time of 2.8 seconds in 1 second 
("more than 100%" of the time interval)


It would be helpful to expose the number and state of tasks in the container 
cgroup. This would be at a very coarse granularity but would give some guidance.


> Expose number of processes and threads in a container
> -----------------------------------------------------
>
>                 Key: MESOS-2103
>                 URL: https://issues.apache.org/jira/browse/MESOS-2103
>             Project: Mesos
>          Issue Type: Improvement
>          Components: isolation
>    Affects Versions: 0.20.0
>            Reporter: Ian Downes
>            Assignee: Chi Zhang
>              Labels: twitter



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
