When a task's util_clamp value is configured via sched_setattr(), this value has to be properly accounted in the corresponding clamp group every time the task is enqueued and dequeued. When cgroups are also in use, the per-task clamp values have to be aggregated with those of the CPU controller's CGroup in which the task is currently living.
Let's update uclamp_cpu_get() to provide an aggregation between the task and the TG clamp values. Every time a task is enqueued, it will be accounted in the clamp group which defines the smaller clamp value between the task's and the TG's ones. This mimics what already happens for a task's CPU affinity mask when the task is also living in a cpuset. The overall idea is that CGroup attributes are always used to restrict the per-task attributes.

For consistency, as well as to properly inform userspace, the sched_getattr() call is updated to always return the properly aggregated constraints as described above. This also makes sched_getattr() a convenient userspace API to know the utilization constraints enforced on a task by the CGroup's CPU controller.

Signed-off-by: Patrick Bellasi <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Steve Muckle <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Morten Rasmussen <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
 kernel/sched/core.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b8299a4f03e7..592de8d32427 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -966,9 +966,18 @@ static inline void uclamp_cpu_get(struct task_struct *p, int cpu, int clamp_id)
 	clamp_value = p->uclamp[clamp_id].value;
 	group_id = p->uclamp[clamp_id].group_id;
 
+#ifdef CONFIG_UCLAMP_TASK_GROUP
+	/* Use TG's clamp value to limit task specific values */
+	if (group_id == UCLAMP_NONE ||
+	    clamp_value >= task_group(p)->uclamp[clamp_id].value) {
+		clamp_value = task_group(p)->uclamp[clamp_id].value;
+		group_id = task_group(p)->uclamp[clamp_id].group_id;
+	}
+#else
 	/* No task specific clamp values: nothing to do */
 	if (group_id == UCLAMP_NONE)
 		return;
+#endif
 
 	/* Increment the current group_id */
 	uc_cpu->group[group_id].tasks += 1;
@@ -5401,6 +5410,12 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
 #ifdef CONFIG_UCLAMP_TASK
 	attr.sched_util_min = p->uclamp[UCLAMP_MIN].value;
 	attr.sched_util_max = p->uclamp[UCLAMP_MAX].value;
+#ifdef CONFIG_UCLAMP_TASK_GROUP
+	if (task_group(p)->uclamp[UCLAMP_MIN].value < attr.sched_util_min)
+		attr.sched_util_min = task_group(p)->uclamp[UCLAMP_MIN].value;
+	if (task_group(p)->uclamp[UCLAMP_MAX].value < attr.sched_util_max)
+		attr.sched_util_max = task_group(p)->uclamp[UCLAMP_MAX].value;
+#endif
 #endif
 
 	rcu_read_unlock();
-- 
2.15.1

