Re: [PATCH v4 08/16] sched/core: uclamp: propagate parent clamps

2018-09-12 Thread Suren Baghdasaryan
On Wed, Sep 12, 2018 at 5:51 AM, Patrick Bellasi wrote:
> On 08-Sep 20:02, Suren Baghdasaryan wrote:
>> On Tue, Aug 28, 2018 at 6:53 AM, Patrick Bellasi wrote:
>
> [...]
>
>> > +  cpu.util.min.effective
>> > +A read-only single value file which exists on non-root cgroups and
>> > +reports minimum utilization clamp value currently enforced on a task
>> > +group.
>> > +
>> > +The actual minimum utilization in the range [0, 1023].
>> > +
>> > +This value can be lower than cpu.util.min in case a parent cgroup
>> > +is enforcing a more restrictive clamping on minimum utilization.
>>
>> IMHO if cpu.util.min=0 means "no restrictions" on UCLAMP_MIN then
>> calling parent's lower cpu.util.min value "more restrictive clamping"
>> is confusing. I would suggest to rephrase this to smth like "...in
>> case a parent cgroup requires lower cpu.util.min clamping."
>
> Right, it's slightly confusing... still I would like to call out that
> a parent group can enforce something on its children. What about:
>
>    "... a parent cgroup allows only smaller minimum utilization values."
>
> Is that less confusing ?

SGTM.

>
> Otherwise I think your proposal could work too.
>
> [...]
>
>> >  #ifdef CONFIG_UCLAMP_TASK_GROUP
>> > +/**
>> > + * cpu_util_update_hier: propagete effective clamp down the hierarchy
>>
>> typo: propagate
>
> +1
>
> [...]
>
>> > +* Skip the whole subtrees if the current effective clamp is
>> > +* alredy matching the TG's clamp value.
>>
>> typo: already
>
> +1
>
>
> Cheers,
> Patrick
>
> --
> #include 
>
> Patrick Bellasi


Re: [PATCH v4 08/16] sched/core: uclamp: propagate parent clamps

2018-09-12 Thread Patrick Bellasi
On 08-Sep 20:02, Suren Baghdasaryan wrote:
> On Tue, Aug 28, 2018 at 6:53 AM, Patrick Bellasi wrote:

[...]

> > +  cpu.util.min.effective
> > +A read-only single value file which exists on non-root cgroups and
> > +reports minimum utilization clamp value currently enforced on a task
> > +group.
> > +
> > +The actual minimum utilization in the range [0, 1023].
> > +
> > +This value can be lower than cpu.util.min in case a parent cgroup
> > +is enforcing a more restrictive clamping on minimum utilization.
> 
> IMHO if cpu.util.min=0 means "no restrictions" on UCLAMP_MIN then
> calling parent's lower cpu.util.min value "more restrictive clamping"
> is confusing. I would suggest to rephrase this to smth like "...in
> case a parent cgroup requires lower cpu.util.min clamping."

Right, it's slightly confusing... still I would like to call out that
a parent group can enforce something on its children. What about:

   "... a parent cgroup allows only smaller minimum utilization values."

Is that less confusing ?

Otherwise I think your proposal could work too.

[...]

> >  #ifdef CONFIG_UCLAMP_TASK_GROUP
> > +/**
> > + * cpu_util_update_hier: propagete effective clamp down the hierarchy
> 
> typo: propagate

+1

[...]

> > +* Skip the whole subtrees if the current effective clamp is
> > +* alredy matching the TG's clamp value.
> 
> typo: already

+1
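
For readers following along, the subtree-skipping idea in the comment
quoted above could be sketched roughly as follows. This is not the code
from the patch: struct group, update_hier() and the child/sibling links
are hypothetical names used only for illustration, and the skip is only
valid under the assumption that the tree was consistent before the
change being propagated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a task group with one clamp. */
struct group {
	unsigned int value;      /* clamp requested for this group */
	unsigned int effective;  /* clamp actually enforced on this group */
	struct group *child;     /* first child, or NULL */
	struct group *sibling;   /* next sibling, or NULL */
};

/*
 * Recompute the effective clamp of @g from its parent's effective clamp
 * and push the result down the hierarchy. Returns the number of groups
 * whose effective value changed.
 */
int update_hier(struct group *g, unsigned int parent_effective)
{
	unsigned int eff;
	int updated;

	assert(g != NULL);

	/* Effective clamp: the smaller of the request and the parent's limit. */
	eff = g->value < parent_effective ? g->value : parent_effective;

	/*
	 * Nothing changed for this group, so (assuming the subtree was
	 * consistent before) nothing can change below it: skip the subtree.
	 */
	if (eff == g->effective)
		return 0;

	g->effective = eff;
	updated = 1;

	for (struct group *c = g->child; c != NULL; c = c->sibling)
		updated += update_hier(c, eff);

	return updated;
}
```

The return value is only there to make the effect of the skip visible
when experimenting; a real implementation would not necessarily need it.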


Cheers,
Patrick

-- 
#include 

Patrick Bellasi


Re: [PATCH v4 08/16] sched/core: uclamp: propagate parent clamps

2018-09-11 Thread Tejun Heo
Hello, Patrick.

On Tue, Sep 11, 2018 at 05:26:24PM +0100, Patrick Bellasi wrote:
> My question is: IF the scheduler maintainers are going to be happy
> with the overall design for the core bits, are you happy to start the
> review of the cgroups bits before the core ones are (eventually) merged?

Yeah, sure, once the feature is more or less agreed on the scheduler
core side, we can delve into how it should be represented in cgroup.

Thanks.

-- 
tejun


Re: [PATCH v4 08/16] sched/core: uclamp: propagate parent clamps

2018-09-11 Thread Patrick Bellasi
On 11-Sep 08:18, Tejun Heo wrote:
> Hello, Patrick.

Hi Tejun,

> Can we concentrate on getting in the non-cgroup part first?

That's the reason why I've reordered (as per your request) the series
to have all the core and non-cgroup related bits at the beginning.

There are a couple of patches at the end of this series which can be
anticipated but, apart from those, the cgroup code is very well
self-contained within patches 7-12.

> The feature has to make sense without cgroup too

Indeed, this is what I worked on since you pointed out in v1 that
there must be a meaningful non-cgroup API and that's what we have
since v2.

> and I think it'd be a lot easier to discuss cgroup details once the
> scheduler core side is settled.

IMHO, developing the cgroup interface on top of the core bits is quite
important to ensure that we have effective data structures and
implementation which can satisfy both worlds.

My question is: IF the scheduler maintainers are going to be happy
with the overall design for the core bits, are you happy to start the
review of the cgroups bits before the core ones are (eventually) merged?

Cheers,
Patrick

-- 
#include 

Patrick Bellasi


Re: [PATCH v4 08/16] sched/core: uclamp: propagate parent clamps

2018-09-11 Thread Tejun Heo
Hello, Patrick.

Can we concentrate on getting in the non-cgroup part first?  The
feature has to make sense without cgroup too and I think it'd be a lot
easier to discuss cgroup details once the scheduler core side is
settled.

Thanks.

-- 
tejun


Re: [PATCH v4 08/16] sched/core: uclamp: propagate parent clamps

2018-09-08 Thread Suren Baghdasaryan
On Tue, Aug 28, 2018 at 6:53 AM, Patrick Bellasi wrote:
> In order to properly support hierarchical resource control, the cgroup
> delegation model requires that attribute writes from a child group never
> fail but are still (potentially) constrained by the parent's assigned
> resources. This requires properly propagating and aggregating parent
> attributes down to its descendants.
>
> Let's implement this mechanism by adding a new "effective" clamp value
> for each task group. The effective clamp value is defined as the smaller
> value between the clamp value of a group and the effective clamp value
> of its parent. This also represents the clamp value which is actually
> used to clamp tasks in each task group.
>
> Since it can be interesting for tasks in a cgroup to know exactly what
> the currently propagated/enforced configuration is, the effective clamp
> values are exposed to user-space by means of a new pair of read-only
> attributes: cpu.util.{min,max}.effective.
>
> Signed-off-by: Patrick Bellasi 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Cc: Tejun Heo 
> Cc: Rafael J. Wysocki 
> Cc: Viresh Kumar 
> Cc: Suren Baghdasaryan 
> Cc: Todd Kjos 
> Cc: Joel Fernandes 
> Cc: Juri Lelli 
> Cc: Quentin Perret 
> Cc: Dietmar Eggemann 
> Cc: Morten Rasmussen 
> Cc: linux-kernel@vger.kernel.org
> Cc: linux...@vger.kernel.org
>
> ---
> Changes in v4:
>  Message-ID: <20180816140731.GD2960@e110439-lin>
>  - add ".effective" attributes to the default hierarchy
>  Others:
>  - small documentation fixes
>  - rebased on v4.19-rc1
>
> Changes in v3:
>  Message-ID: <20180409222417.gk3126...@devbig577.frc2.facebook.com>
>  - new patch in v3, to implement a suggestion from v1 review
> ---
>  Documentation/admin-guide/cgroup-v2.rst |  25 +-
>  include/linux/sched.h                   |   8 ++
>  kernel/sched/core.c                     | 112 +++-
>  3 files changed, 139 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 80ef7bdc517b..72272f58d304 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -976,22 +976,43 @@ All time durations are in microseconds.
>  A read-write single value file which exists on non-root cgroups.
>  The default is "0", i.e. no bandwidth boosting.
>
> -The minimum utilization in the range [0, 1023].
> +The requested minimum utilization in the range [0, 1023].
>
>  This interface allows reading and setting minimum utilization clamp
>  values similar to the sched_setattr(2). This minimum utilization
>  value is used to clamp the task specific minimum utilization clamp.
>
> +  cpu.util.min.effective
> +A read-only single value file which exists on non-root cgroups and
> +reports minimum utilization clamp value currently enforced on a task
> +group.
> +
> +The actual minimum utilization in the range [0, 1023].
> +
> +This value can be lower than cpu.util.min in case a parent cgroup
> +is enforcing a more restrictive clamping on minimum utilization.

IMHO if cpu.util.min=0 means "no restrictions" on UCLAMP_MIN then
calling parent's lower cpu.util.min value "more restrictive clamping"
is confusing. I would suggest to rephrase this to smth like "...in
case a parent cgroup requires lower cpu.util.min clamping."

> +
>    cpu.util.max
>  A read-write single value file which exists on non-root cgroups.
>  The default is "1023". i.e. no bandwidth clamping
>
> -The maximum utilization in the range [0, 1023].
> +The requested maximum utilization in the range [0, 1023].
>
>  This interface allows reading and setting maximum utilization clamp
>  values similar to the sched_setattr(2). This maximum utilization
>  value is used to clamp the task specific maximum utilization clamp.
>
> +  cpu.util.max.effective
> +A read-only single value file which exists on non-root cgroups and
> +reports maximum utilization clamp value currently enforced on a task
> +group.
> +
> +The actual maximum utilization in the range [0, 1023].
> +
> +This value can be lower than cpu.util.max in case a parent cgroup
> +is enforcing a more restrictive clamping on max utilization.
> +
> +
>  Memory
>  ------
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index dc39b67a366a..2da130d17e70 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -591,6 +591,14 @@ struct sched_dl_entity {
>  struct uclamp_se {
>  	unsigned int value;
>  	unsigned int group_id;
> +	/*
> +	 * Effective task (group) clamp value.
> +	 * For task groups is the value (eventually) enforced by a parent task
> +	 * group.
> +	 */
> +	struct {
> +		unsigned int value;
> +	} effective;
>  };
>
>  union 

[PATCH v4 08/16] sched/core: uclamp: propagate parent clamps

2018-08-28 Thread Patrick Bellasi
In order to properly support hierarchical resource control, the cgroup
delegation model requires that attribute writes from a child group never
fail but are still (potentially) constrained by the parent's assigned
resources. This requires properly propagating and aggregating parent
attributes down to its descendants.

Let's implement this mechanism by adding a new "effective" clamp value
for each task group. The effective clamp value is defined as the smaller
value between the clamp value of a group and the effective clamp value
of its parent. This also represents the clamp value which is actually
used to clamp tasks in each task group.
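
As a minimal illustration of the definition above (a sketch, not the
code in this patch), the effective value along a root-to-leaf path could
be computed as below. The names effective_clamp() and
effective_clamp_chain() are hypothetical, and starting the walk from
1023 (i.e. no restriction above the first group) is an assumption.

```c
#include <assert.h>

/* Upper bound of the [0, 1023] range documented for the clamp values. */
#define UCLAMP_MAX_VALUE 1023u

/*
 * Hypothetical helper: the effective clamp of a group is the smaller of
 * its own requested clamp and the effective clamp of its parent.
 */
unsigned int effective_clamp(unsigned int requested,
			     unsigned int parent_effective)
{
	assert(requested <= UCLAMP_MAX_VALUE);
	assert(parent_effective <= UCLAMP_MAX_VALUE);

	return requested < parent_effective ? requested : parent_effective;
}

/*
 * Fold the rule over a root-to-leaf chain of requested values.
 * requested[0] belongs to the topmost group, requested[depth - 1] to
 * the leaf. Returns the effective clamp of the leaf.
 */
unsigned int effective_clamp_chain(const unsigned int *requested, int depth)
{
	/* Assumption: nothing restricts the topmost group from above. */
	unsigned int eff = UCLAMP_MAX_VALUE;

	for (int i = 0; i < depth; i++)
		eff = effective_clamp(requested[i], eff);

	return eff;
}
```

For example, with requested values 800, 300 and 600 from the top down,
the leaf ends up with an effective value of 300: its own request of 600
is accepted (the write does not fail) but is constrained by the
ancestor that asked for 300.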

Since it can be interesting for tasks in a cgroup to know exactly what
the currently propagated/enforced configuration is, the effective clamp
values are exposed to user-space by means of a new pair of read-only
attributes: cpu.util.{min,max}.effective.
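
From user-space, such a single-value attribute could be read with
something like the sketch below. The helper name read_clamp_value() is
hypothetical, and so is any concrete path passed to it: the file names
are the ones proposed in this series, and where the cgroup hierarchy is
mounted depends on the system.

```c
#include <assert.h>
#include <stdio.h>

/* Upper bound of the [0, 1023] range documented for the clamp values. */
#define UCLAMP_MAX_VALUE 1023u

/*
 * Hypothetical helper: read a single-value clamp attribute from @path.
 * Returns 0 and stores the value in @out on success, -1 if the file
 * cannot be opened, does not start with an unsigned integer, or holds a
 * value outside [0, 1023].
 */
int read_clamp_value(const char *path, unsigned int *out)
{
	FILE *f;
	unsigned int value;
	int matched;

	assert(path != NULL && out != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	matched = fscanf(f, "%u", &value);
	fclose(f);

	if (matched != 1 || value > UCLAMP_MAX_VALUE)
		return -1;

	*out = value;
	return 0;
}
```

A caller would pass the path of the cpu.util.min.effective (or
cpu.util.max.effective) file of the group of interest and compare the
result with the requested cpu.util.min (or cpu.util.max) to see whether
a parent is currently constraining the group.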

Signed-off-by: Patrick Bellasi 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Tejun Heo 
Cc: Rafael J. Wysocki 
Cc: Viresh Kumar 
Cc: Suren Baghdasaryan 
Cc: Todd Kjos 
Cc: Joel Fernandes 
Cc: Juri Lelli 
Cc: Quentin Perret 
Cc: Dietmar Eggemann 
Cc: Morten Rasmussen 
Cc: linux-kernel@vger.kernel.org
Cc: linux...@vger.kernel.org

---
Changes in v4:
 Message-ID: <20180816140731.GD2960@e110439-lin>
 - add ".effective" attributes to the default hierarchy
 Others:
 - small documentation fixes
 - rebased on v4.19-rc1

Changes in v3:
 Message-ID: <20180409222417.gk3126...@devbig577.frc2.facebook.com>
 - new patch in v3, to implement a suggestion from v1 review
---
 Documentation/admin-guide/cgroup-v2.rst |  25 +-
 include/linux/sched.h                   |   8 ++
 kernel/sched/core.c                     | 112 +++-
 3 files changed, 139 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 80ef7bdc517b..72272f58d304 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -976,22 +976,43 @@ All time durations are in microseconds.
 A read-write single value file which exists on non-root cgroups.
 The default is "0", i.e. no bandwidth boosting.
 
-The minimum utilization in the range [0, 1023].
+The requested minimum utilization in the range [0, 1023].
 
 This interface allows reading and setting minimum utilization clamp
 values similar to the sched_setattr(2). This minimum utilization
 value is used to clamp the task specific minimum utilization clamp.
 
+  cpu.util.min.effective
+A read-only single value file which exists on non-root cgroups and
+reports minimum utilization clamp value currently enforced on a task
+group.
+
+The actual minimum utilization in the range [0, 1023].
+
+This value can be lower than cpu.util.min in case a parent cgroup
+is enforcing a more restrictive clamping on minimum utilization.
+
   cpu.util.max
 A read-write single value file which exists on non-root cgroups.
 The default is "1023". i.e. no bandwidth clamping
 
-The maximum utilization in the range [0, 1023].
+The requested maximum utilization in the range [0, 1023].
 
 This interface allows reading and setting maximum utilization clamp
 values similar to the sched_setattr(2). This maximum utilization
 value is used to clamp the task specific maximum utilization clamp.
 
+  cpu.util.max.effective
+A read-only single value file which exists on non-root cgroups and
+reports maximum utilization clamp value currently enforced on a task
+group.
+
+The actual maximum utilization in the range [0, 1023].
+
+This value can be lower than cpu.util.max in case a parent cgroup
+is enforcing a more restrictive clamping on max utilization.
+
+
 Memory
 ------
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index dc39b67a366a..2da130d17e70 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -591,6 +591,14 @@ struct sched_dl_entity {
 struct uclamp_se {
 	unsigned int value;
 	unsigned int group_id;
+	/*
+	 * Effective task (group) clamp value.
+	 * For task groups is the value (eventually) enforced by a parent task
+	 * group.
+	 */
+	struct {
+		unsigned int value;
+	} effective;
 };
 
 union rcu_special {
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index dcbf22abd0bf..b2d438b6484b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1254,6 +1254,8 @@ static inline int alloc_uclamp_sched_group(struct task_group *tg,
 
 	for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id) {
 		uc_se = &tg->uclamp[clamp_id];
+		uc_se->effective.value =
+			parent->uclamp[clamp_id].effective.value;
 		uc_se->value = parent->uclamp[clamp_id].value;
   
