Re: [PATCH v3 0/4] sched/fair: Burstable CFS bandwidth controller
> On Mar 10, 2021, at 7:11 PM, Odin Ugedal wrote:
>
> Hi,
>
>> If there are cases where the "start bandwidth" matters, I think there is
>> need to expose the "start bandwidth" explicitly too. However, I doubt the
>> existence of such cases from my view and the two examples above.
>
> Yeah, I don't think there will be any cases where users will be
> "depending" on having burst available, so I agree in that sense.
>
>> In my thoughts, this patchset keeps cgroup usage within the quota in the
>> longer term, and allows cgroup to respond to a burst of work with the
>> help of a reasonable burst buffer, if quota is set correctly above
>> average usage and enough burst buffer is set to meet the needs of bursty
>> work. In this case, it makes no difference whether this cgroup runs with
>> 0 start bandwidth or all of it. Thus I used
>> sysctl_sched_cfs_bw_burst_onset_percent to decide the start bandwidth to
>> leave some convenience here. If this sysctl interface is confusing, I
>> wonder whether it is a good idea not to expose this interface.
>>
>> For the first case mentioned above, if Kubernetes users care about the
>> "start bandwidth" for process startup, maybe it is better to give all of
>> it rather than a part?
>
> Yeah, I am a bit afraid there will be some confusion, so not sure if
> the sysctl is the best way to do it.
>
> But I would like feedback from others to highlight the problem as well,
> that would be helpful. I think a simple "API" where you get 0 burst or
> full burst on "set" (the one we decide on) would be best to avoid
> unnecessary complexity.
>
> Start burst when starting up a new process in a new cgroup might be
> helpful, so maybe that is a vote for full burst? However, in the long
> term that doesn't matter, so 0 burst on start would work as well.
>
>> For the second case with quota changes over time, I think it is
>> important to make sure each change works long enough to enforce the
>> average quota limit. Does it really matter to control the "start burst"
>> on each change?
>
> No, I don't think so. Doing so would be another thing to set per cgroup,
> and that would just clutter the API more than necessary imo, since we
> cannot come up with any real use cases.
>
>> It is a copy of runtime at period start, and used to calculate burst
>> time during a period. Not quite remaining_runtime_prev_period.
>
> Ahh, I see, I misunderstood the code. So in essence it is
> "runtime_at_period_start"?
>

Yes, it is "runtime_at_period_start".

>> Yeah, there is the updating problem. It is okay not to expose
>> cfs_b->runtime then.
>
> Yeah, I think dropping it altogether is the best solution.
>
>> This comment does not mean any unnecessary throttle for the present
>> cfs_b. All it means is that all quota refilling that is not done while
>> the timer is stopped should be refilled on timer start, for the
>> burstable cfs_b.
>>
>> Maybe I shall change this comment in some way if it is misleading?
>
> I think I formulated my question badly. The comment makes sense, I am
> just trying to compare how "start_cfs_bandwidth" works after your patch
> compared to how it works currently. As I understand, without this patch
> "start_cfs_bandwidth" will never refill runtime, while with your patch it
> will refill even when overrun=0 with burst disabled. Is that an intended
> change in behavior, or am I not understanding the patch?
>

Good point. The way "start_cfs_bandwidth" works is changed indeed. The
present cfs_b doesn't have to refill bandwidth because quota is not used
during the period before the timer stops. With this patch, runtime is
refilled whether burst is enabled or not. Do you suggest not refilling
runtime unless burst is enabled here?

> On another note, I have also been testing this patch, and I am not able
> to reproduce your schbench results. Both with and without burst, it gets
> the same result, and nr_throttled stays at 0 (tested on a 32-core
> system). Can you try to rerun your tests with the mainline to see if you
> still get the same results? (Also, I see you are running with 30
> threads. How many cores does your test setup have?) To actually say that
> the result is real, all cores used should maybe be exclusively reserved
> as well, to avoid issues where other processes cause a spike in latency.
>

Spikes indeed cause trouble. If nr_throttled stays at 0, I suggest
changing quota from 70 to 60, which is still above the average
utilization of 500%. I have rerun on a 64-core system and reproduced the
results. And I think it should work on a 32-core system too, as there are
20 active workers in each round. If you still have trouble, I suggest
testing in the following way, which should work on a two-core system:

	mkdir /sys/fs/cgroup/cpu/test
	echo $$ > /sys/fs/cgroup/cpu/test/cgroup.procs
	echo 10 > /sys/fs/cgroup/cpu/test/cpu.cfs_quota_us
	echo 30 > /sys/fs/cgroup/cpu/test/cpu.cfs_burst_us

	./schbench -m 1 -t 3 -r
Re: [PATCH v3 0/4] sched/fair: Burstable CFS bandwidth controller
On Tue, Jan 26, 2021 at 06:18:59PM +0800, changhuaixin wrote:

> Ping, any new comments on this patchset? If there're no other
> concerns, I think it's ready to be merged?

I only found this by accident... Thread got lost because you're posting
new series as replies to older series. Please don't do that, it's crap.

I'll go have a look.
Re: [PATCH v3 0/4] sched/fair: Burstable CFS bandwidth controller
Hi,

> If there are cases where the "start bandwidth" matters, I think there is
> need to expose the "start bandwidth" explicitly too. However, I doubt the
> existence of such cases from my view and the two examples above.

Yeah, I don't think there will be any cases where users will be
"depending" on having burst available, so I agree in that sense.

> In my thoughts, this patchset keeps cgroup usage within the quota in the
> longer term, and allows cgroup to respond to a burst of work with the
> help of a reasonable burst buffer, if quota is set correctly above
> average usage and enough burst buffer is set to meet the needs of bursty
> work. In this case, it makes no difference whether this cgroup runs with
> 0 start bandwidth or all of it. Thus I used
> sysctl_sched_cfs_bw_burst_onset_percent to decide the start bandwidth to
> leave some convenience here. If this sysctl interface is confusing, I
> wonder whether it is a good idea not to expose this interface.
>
> For the first case mentioned above, if Kubernetes users care about the
> "start bandwidth" for process startup, maybe it is better to give all of
> it rather than a part?

Yeah, I am a bit afraid there will be some confusion, so not sure if
the sysctl is the best way to do it.

But I would like feedback from others to highlight the problem as well,
that would be helpful. I think a simple "API" where you get 0 burst or
full burst on "set" (the one we decide on) would be best to avoid
unnecessary complexity.

Start burst when starting up a new process in a new cgroup might be
helpful, so maybe that is a vote for full burst? However, in the long
term that doesn't matter, so 0 burst on start would work as well.

> For the second case with quota changes over time, I think it is
> important to make sure each change works long enough to enforce the
> average quota limit. Does it really matter to control the "start burst"
> on each change?

No, I don't think so. Doing so would be another thing to set per cgroup,
and that would just clutter the API more than necessary imo, since we
cannot come up with any real use cases.

> It is a copy of runtime at period start, and used to calculate burst
> time during a period. Not quite remaining_runtime_prev_period.

Ahh, I see, I misunderstood the code. So in essence it is
"runtime_at_period_start"?

> Yeah, there is the updating problem. It is okay not to expose
> cfs_b->runtime then.

Yeah, I think dropping it altogether is the best solution.

> This comment does not mean any unnecessary throttle for the present
> cfs_b. All it means is that all quota refilling that is not done while
> the timer is stopped should be refilled on timer start, for the
> burstable cfs_b.
>
> Maybe I shall change this comment in some way if it is misleading?

I think I formulated my question badly. The comment makes sense, I am
just trying to compare how "start_cfs_bandwidth" works after your patch
compared to how it works currently. As I understand, without this patch
"start_cfs_bandwidth" will never refill runtime, while with your patch it
will refill even when overrun=0 with burst disabled. Is that an intended
change in behavior, or am I not understanding the patch?

On another note, I have also been testing this patch, and I am not able
to reproduce your schbench results. Both with and without burst, it gets
the same result, and nr_throttled stays at 0 (tested on a 32-core
system). Can you try to rerun your tests with the mainline to see if you
still get the same results? (Also, I see you are running with 30 threads.
How many cores does your test setup have?) To actually say that the
result is real, all cores used should maybe be exclusively reserved as
well, to avoid issues where other processes cause a spike in latency.

Odin
Re: [PATCH v3 0/4] sched/fair: Burstable CFS bandwidth controller
Hi, sorry for my late reply.

> On Feb 9, 2021, at 9:17 PM, Odin Ugedal wrote:
>
> Hi! This looks quite useful, but I have a few quick thoughts. :)
>
> I know of a lot of people who would love this (especially some
> Kubernetes users)! I really like how this allows users to use cfs in a
> more dynamic and flexible way, without interfering with those who like
> to enforce strict quotas.
>
>> +++ b/kernel/sched/core.c
>> @@ -7900,7 +7910,7 @@ static int tg_set_cfs_bandwidth(struct task_group *tg, u64 period, u64
>> [...]
>> +	/* burst_onset needed */
>> +	if (cfs_b->quota != RUNTIME_INF &&
>> +		sysctl_sched_cfs_bw_burst_enabled &&
>> +		sysctl_sched_cfs_bw_burst_onset_percent > 0) {
>> +
>> +		burst_onset = do_div(burst, 100) *
>> +			sysctl_sched_cfs_bw_burst_onset_percent;
>> +
>> +		cfs_b->runtime += burst_onset;
>> +		cfs_b->runtime = min(max_cfs_runtime, cfs_b->runtime);
>> +	}
>
> I saw a comment about this behavior, but I think this can lead to a bit
> of confusion. If sysctl_sched_cfs_bw_burst_onset_percent=0, the amount
> of bandwidth when the first process starts up will depend on the time
> between when the quota was set and the startup of the process, and that
> feels a bit like a "timing" race that end user applications then will
> have to think about.
>
> I suspect container runtimes and/or tools like Kubernetes will then have
> to tell users to set the value to a certain amount in order to make it
> work as expected.
>
> Another thing is that when a cgroup has saved some time into the "burst
> quota", updating the quota, period or burst will then reset the "burst
> quota", even though e.g. only the burst was changed. Some tools use
> dynamic quotas, resulting in multiple changes in the quota over time,
> and I am a bit scared that not allowing them to control the "start
> burst" on a write can be limiting.
>
> Maybe we can allow people to set the "start bandwidth" explicitly when
> setting cfs_burst if they want to do that? (edit: that might be hard for
> cgroup v1, but I think that is a good solution on cgroup v2.)
>
> This is however just my thoughts, and I am not 100% sure about what the
> best solution is, but if we merge a certain behavior, we have no real
> chance of changing it later.
>

If there are cases where the "start bandwidth" matters, I think there is
need to expose the "start bandwidth" explicitly too. However, I doubt the
existence of such cases from my view and the two examples above.

In my thoughts, this patchset keeps cgroup usage within the quota in the
longer term, and allows cgroup to respond to a burst of work with the
help of a reasonable burst buffer, if quota is set correctly above
average usage and enough burst buffer is set to meet the needs of bursty
work. In this case, it makes no difference whether this cgroup runs with
0 start bandwidth or all of it. Thus I used
sysctl_sched_cfs_bw_burst_onset_percent to decide the start bandwidth to
leave some convenience here. If this sysctl interface is confusing, I
wonder whether it is a good idea not to expose this interface.

For the first case mentioned above, if Kubernetes users care about the
"start bandwidth" for process startup, maybe it is better to give all of
it rather than a part?

For the second case with quota changes over time, I think it is important
to make sure each change works long enough to enforce the average quota
limit. Does it really matter to control the "start burst" on each change?

>> +++ b/kernel/sched/sched.h
>> @@ -367,6 +367,7 @@ struct cfs_bandwidth {
>> 	u64 burst;
>> 	u64 buffer;
>> 	u64 max_overrun;
>> +	u64 previous_runtime;
>> 	s64 hierarchical_quota;
>
> Maybe indicate that this was the remaining runtime _after_ the previous
> period ended? Not 100% sure, but maybe something like
> 'remaining_runtime_prev_period' or 'end_runtime_prev_period' (as
> inspiration).
>

It is a copy of runtime at period start, and used to calculate burst time
during a period. Not quite remaining_runtime_prev_period.

>> +++ b/kernel/sched/core.c
>> @@ -8234,6 +8236,10 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
>> 		seq_printf(sf, "wait_sum %llu\n", ws);
>> 	}
>>
>> +	seq_printf(sf, "current_bw %llu\n", cfs_b->runtime);
>> +	seq_printf(sf, "nr_burst %d\n", cfs_b->nr_burst);
>> +	seq_printf(sf, "burst_time %llu\n", cfs_b->burst_time);
>> +
>> 	return 0;
>> }
>
> Looks like these metrics are missing from the cgroup v2 stats.
>

Thanks, cgroup v2 stats and documentation should be handled. I will add
them in the following patchset.

> Are we sure it is smart to start exposing cfs_b->runtime, since it makes
> it harder to change the implementation at a later time? I don't think it
> is that useful, and if it is only exposed
Re: [PATCH v3 0/4] sched/fair: Burstable CFS bandwidth controller
Hello,

On Tue, Feb 09, 2021 at 02:17:19PM +0100, Odin Ugedal wrote:

> I am not that familiar with how cross-subsystem patches like these are
> handled, but I am still adding Tejun Heo (cgroup maintainer) as a CC.
> Should maybe cc to cgroup@ as well?

Yeah, that'd be great.

Given that it's mostly a straightforward extension of an existing
interface, things look fine from the cgroup side; however, please do add
the cgroup2 interface and documentation.

One thing which has been bothersome about the bandwidth interface is that
we're exposing implementation details (window size and now burst size)
instead of usage-centric requirements, but that boat has already sailed,
so...

Thanks.

--
tejun
Re: [PATCH v3 0/4] sched/fair: Burstable CFS bandwidth controller
Hi! This looks quite useful, but I have a few quick thoughts. :)

I know of a lot of people who would love this (especially some
Kubernetes users)! I really like how this allows users to use cfs in a
more dynamic and flexible way, without interfering with those who like
to enforce strict quotas.

> +++ b/kernel/sched/core.c
> @@ -7900,7 +7910,7 @@ static int tg_set_cfs_bandwidth(struct task_group *tg, u64 period, u64
> [...]
> +	/* burst_onset needed */
> +	if (cfs_b->quota != RUNTIME_INF &&
> +		sysctl_sched_cfs_bw_burst_enabled &&
> +		sysctl_sched_cfs_bw_burst_onset_percent > 0) {
> +
> +		burst_onset = do_div(burst, 100) *
> +			sysctl_sched_cfs_bw_burst_onset_percent;
> +
> +		cfs_b->runtime += burst_onset;
> +		cfs_b->runtime = min(max_cfs_runtime, cfs_b->runtime);
> +	}

I saw a comment about this behavior, but I think this can lead to a bit
of confusion. If sysctl_sched_cfs_bw_burst_onset_percent=0, the amount of
bandwidth when the first process starts up will depend on the time
between when the quota was set and the startup of the process, and that
feels a bit like a "timing" race that end user applications then will
have to think about.

I suspect container runtimes and/or tools like Kubernetes will then have
to tell users to set the value to a certain amount in order to make it
work as expected.

Another thing is that when a cgroup has saved some time into the "burst
quota", updating the quota, period or burst will then reset the "burst
quota", even though e.g. only the burst was changed. Some tools use
dynamic quotas, resulting in multiple changes in the quota over time, and
I am a bit scared that not allowing them to control the "start burst" on
a write can be limiting.

Maybe we can allow people to set the "start bandwidth" explicitly when
setting cfs_burst if they want to do that? (edit: that might be hard for
cgroup v1, but I think that is a good solution on cgroup v2.)

This is however just my thoughts, and I am not 100% sure about what the
best solution is, but if we merge a certain behavior, we have no real
chance of changing it later.

> +++ b/kernel/sched/sched.h
> @@ -367,6 +367,7 @@ struct cfs_bandwidth {
> 	u64 burst;
> 	u64 buffer;
> 	u64 max_overrun;
> +	u64 previous_runtime;
> 	s64 hierarchical_quota;

Maybe indicate that this was the remaining runtime _after_ the previous
period ended? Not 100% sure, but maybe something like
'remaining_runtime_prev_period' or 'end_runtime_prev_period' (as
inspiration).

> +++ b/kernel/sched/core.c
> @@ -8234,6 +8236,10 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
> 		seq_printf(sf, "wait_sum %llu\n", ws);
> 	}
>
> +	seq_printf(sf, "current_bw %llu\n", cfs_b->runtime);
> +	seq_printf(sf, "nr_burst %d\n", cfs_b->nr_burst);
> +	seq_printf(sf, "burst_time %llu\n", cfs_b->burst_time);
> +
> 	return 0;
> }

Looks like these metrics are missing from the cgroup v2 stats.

Are we sure it is smart to start exposing cfs_b->runtime, since it makes
it harder to change the implementation at a later time? I don't think it
is that useful, and if it is only exposed for debugging purposes people
can probably use kprobes instead? Also, it would not be useful unless you
know how much wall time is left in the current period. In that sense,
cfs_b->previous_runtime would probably be more useful, but I am still not
sure if it deserves to be exposed to end users like this.

Also, will "cfs_b->runtime" keep updating if no processes are running, or
will it be the same here, but update (with burst via timer overrun) when
a process starts again? If so, the runtime available when a process
starts on cgroup init can be hard to communicate if the value here
doesn't update.

> +++ b/kernel/sched/fair.c
> +void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b, int init)
> [...]
> +	/*
> +	 * When period timer stops, quota for the following period is not
> +	 * refilled, however period timer is already forwarded. We should
> +	 * accumulate quota once more than overrun here.
> +	 */

Trying to wrap my head around this one... Is not refilling here, as was
the behavior before your patch, causing "loss" in runtime and possibly
causing an unnecessary cgroup throttle?

I am not that familiar with how cross-subsystem patches like these are
handled, but I am still adding Tejun Heo (cgroup maintainer) as a CC.
Should maybe cc to cgroup@ as well?

Sorry for a long mail, in retrospect it should have been one per patch...
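The burst_onset logic quoted in the review above grants a percentage of
the configured burst immediately when bandwidth is (re)set, instead of
starting at zero accumulated burst. A rough sketch of that semantics in
plain Python (integer division stands in for the kernel's do_div; the
function name, the microsecond units, and starting from a full quota are
illustrative assumptions, not the actual implementation):

```python
def onset_runtime(quota, burst, onset_percent, max_runtime):
    """Sketch of the runtime a group holds right after bandwidth is set.

    A fraction (onset_percent) of the configured burst is credited on
    top of the quota, clamped to the overall cap max_runtime (assumed
    here to be quota + burst). All values in microseconds.
    """
    runtime = quota  # assumption: a full quota is granted on set
    if onset_percent > 0:
        burst_onset = burst // 100 * onset_percent  # percent of burst
        runtime = min(runtime + burst_onset, max_runtime)
    return runtime

# 100ms quota, 40ms burst: 50% onset starts the group at 120ms,
# 100% onset at the full quota + burst, 0% at plain quota.
assert onset_runtime(100000, 40000, 50, 140000) == 120000
assert onset_runtime(100000, 40000, 100, 140000) == 140000
assert onset_runtime(100000, 40000, 0, 140000) == 100000
```

This also illustrates the two simple "API" options discussed in the
thread: onset_percent=0 (start with no burst) and onset_percent=100
(start with full burst), with the sysctl interpolating between them.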
Re: [PATCH v3 0/4] sched/fair: Burstable CFS bandwidth controller
> On Jan 21, 2021, at 7:04 PM, Huaixin Chang wrote:
>
> Changelog
>
> v3:
> 1. Fix another issue reported by the test robot.
> 2. Update docs as Randy Dunlap suggested.
>
> v2:
> 1. Fix an issue reported by the test robot.
> 2. Rewrite docs. Appreciate any further suggestions or help.
>
> The CFS bandwidth controller limits CPU requests of a task group to
> quota during each period. However, parallel workloads might be bursty
> so that they get throttled. And they are latency sensitive at the same
> time so that throttling them is undesired.
>
> Scaling up period and quota allows greater burst capacity. But it might
> cause longer stalls until the next refill. We introduce "burst" to allow
> accumulating unused quota from previous periods, to be assigned when a
> task group requests more CPU than quota during a specific period. This
> allows CPU time requests as long as the average requested CPU time is
> below quota in the long run. The maximum accumulation is capped by burst
> and is set to 0 by default, so the traditional behaviour remains.
>
> A huge drop of 99th tail latency from more than 500ms to 27ms is seen
> for real java workloads when using burst. Similar drops are seen when
> testing with schbench too:
>
> 	echo $$ > /sys/fs/cgroup/cpu/test/cgroup.procs
> 	echo 70 > /sys/fs/cgroup/cpu/test/cpu.cfs_quota_us
> 	echo 10 > /sys/fs/cgroup/cpu/test/cpu.cfs_period_us
> 	echo 40 > /sys/fs/cgroup/cpu/test/cpu.cfs_burst_us
>
> 	# The average CPU usage is around 500%, which is 200ms CPU time
> 	# every 40ms.
> 	./schbench -m 1 -t 30 -r 60 -c 1 -R 500
>
> Without burst:
>
> 	Latency percentiles (usec)
> 		50.0000th: 7
> 		75.0000th: 8
> 		90.0000th: 9
> 		95.0000th: 10
> 		*99.0000th: 933
> 		99.5000th: 981
> 		99.9000th: 3068
> 		min=0, max=20054
> 	rps: 498.31 p95 (usec) 10 p99 (usec) 933 p95/cputime 0.10% p99/cputime 9.33%
>
> With burst:
>
> 	Latency percentiles (usec)
> 		50.0000th: 7
> 		75.0000th: 8
> 		90.0000th: 9
> 		95.0000th: 9
> 		*99.0000th: 12
> 		99.5000th: 13
> 		99.9000th: 19
> 		min=0, max=406
> 	rps: 498.36 p95 (usec) 9 p99 (usec) 12 p95/cputime 0.09% p99/cputime 0.12%
>
> How much workloads will benefit from burstable CFS bandwidth control
> depends on how bursty and how latency sensitive they are.
>
> Previously, Cong Wang and Konstantin Khlebnikov proposed a similar
> feature:
> https://lore.kernel.org/lkml/20180522062017.5193-1-xiyou.wangc...@gmail.com/
> https://lore.kernel.org/lkml/157476581065.5793.4518979877345136813.stgit@buzz/
>
> This time we present more latency statistics and handle overflow while
> accumulating.
>
> Huaixin Chang (4):
>   sched/fair: Introduce primitives for CFS bandwidth burst
>   sched/fair: Make CFS bandwidth controller burstable
>   sched/fair: Add cfs bandwidth burst statistics
>   sched/fair: Add document for burstable CFS bandwidth control
>
>  Documentation/scheduler/sched-bwc.rst | 49 +++--
>  include/linux/sched/sysctl.h          |  2 +
>  kernel/sched/core.c                   | 126 +-
>  kernel/sched/fair.c                   | 58 +---
>  kernel/sched/sched.h                  |  9 ++-
>  kernel/sysctl.c                       | 18 +
>  6 files changed, 232 insertions(+), 30 deletions(-)
>
> --
> 2.14.4.44.g2045bb6

Ping, any new comments on this patchset? If there're no other concerns,
I think it's ready to be merged?
[PATCH v3 0/4] sched/fair: Burstable CFS bandwidth controller
Changelog

v3:
1. Fix another issue reported by the test robot.
2. Update docs as Randy Dunlap suggested.

v2:
1. Fix an issue reported by the test robot.
2. Rewrite docs. Appreciate any further suggestions or help.

The CFS bandwidth controller limits CPU requests of a task group to
quota during each period. However, parallel workloads might be bursty
so that they get throttled. And they are latency sensitive at the same
time so that throttling them is undesired.

Scaling up period and quota allows greater burst capacity. But it might
cause longer stalls until the next refill. We introduce "burst" to allow
accumulating unused quota from previous periods, to be assigned when a
task group requests more CPU than quota during a specific period. This
allows CPU time requests as long as the average requested CPU time is
below quota in the long run. The maximum accumulation is capped by burst
and is set to 0 by default, so the traditional behaviour remains.

A huge drop of 99th tail latency from more than 500ms to 27ms is seen
for real java workloads when using burst. Similar drops are seen when
testing with schbench too:

	echo $$ > /sys/fs/cgroup/cpu/test/cgroup.procs
	echo 70 > /sys/fs/cgroup/cpu/test/cpu.cfs_quota_us
	echo 10 > /sys/fs/cgroup/cpu/test/cpu.cfs_period_us
	echo 40 > /sys/fs/cgroup/cpu/test/cpu.cfs_burst_us

	# The average CPU usage is around 500%, which is 200ms CPU time
	# every 40ms.
	./schbench -m 1 -t 30 -r 60 -c 1 -R 500

Without burst:

	Latency percentiles (usec)
		50.0000th: 7
		75.0000th: 8
		90.0000th: 9
		95.0000th: 10
		*99.0000th: 933
		99.5000th: 981
		99.9000th: 3068
		min=0, max=20054
	rps: 498.31 p95 (usec) 10 p99 (usec) 933 p95/cputime 0.10% p99/cputime 9.33%

With burst:

	Latency percentiles (usec)
		50.0000th: 7
		75.0000th: 8
		90.0000th: 9
		95.0000th: 9
		*99.0000th: 12
		99.5000th: 13
		99.9000th: 19
		min=0, max=406
	rps: 498.36 p95 (usec) 9 p99 (usec) 12 p95/cputime 0.09% p99/cputime 0.12%

How much workloads will benefit from burstable CFS bandwidth control
depends on how bursty and how latency sensitive they are.

Previously, Cong Wang and Konstantin Khlebnikov proposed a similar
feature:
https://lore.kernel.org/lkml/20180522062017.5193-1-xiyou.wangc...@gmail.com/
https://lore.kernel.org/lkml/157476581065.5793.4518979877345136813.stgit@buzz/

This time we present more latency statistics and handle overflow while
accumulating.

Huaixin Chang (4):
  sched/fair: Introduce primitives for CFS bandwidth burst
  sched/fair: Make CFS bandwidth controller burstable
  sched/fair: Add cfs bandwidth burst statistics
  sched/fair: Add document for burstable CFS bandwidth control

 Documentation/scheduler/sched-bwc.rst | 49 +++--
 include/linux/sched/sysctl.h          |  2 +
 kernel/sched/core.c                   | 126 +-
 kernel/sched/fair.c                   | 58 +---
 kernel/sched/sched.h                  |  9 ++-
 kernel/sysctl.c                       | 18 +
 6 files changed, 232 insertions(+), 30 deletions(-)

--
2.14.4.44.g2045bb6
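The accumulation rule the cover letter describes — unused quota carries
over from period to period, capped by burst — can be sketched with a toy
per-period simulation. This is illustrative only (the kernel tracks
runtime in nanoseconds and refills from the period timer; the numbers
below are made up), but it shows why a workload whose long-run average
stays under quota avoids throttling once its spikes fit within burst:

```python
def simulate(demands, quota, burst):
    """Return how many periods get throttled for a per-period demand list.

    At each period start, runtime is topped up by quota, but the total is
    capped at quota + burst, so unused quota accumulates up to `burst`.
    """
    runtime = quota  # assumption: start with one full quota
    nr_throttled = 0
    for demand in demands:
        if demand > runtime:
            nr_throttled += 1  # demand exceeds available runtime
            runtime = 0
        else:
            runtime -= demand
        runtime = min(runtime + quota, quota + burst)  # capped refill
    return nr_throttled

# Average demand is 6.5 per period, below the quota of 7, but every
# fourth period spikes to 20.
demands = [2, 2, 2, 20, 2, 2, 2, 20]
assert simulate(demands, quota=7, burst=0) == 2   # both spikes throttled
assert simulate(demands, quota=7, burst=13) == 0  # burst absorbs spikes
```

With burst=0 the refill degenerates to the existing behaviour: runtime is
simply reset to quota each period, and any spike above quota throttles.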