On Tue, Nov 8, 2016 at 3:28 PM, John Stultz <[email protected]> wrote:
> This patch adds logic to allow a process to migrate other tasks
> between cgroups if they have CAP_SYS_RESOURCE.
>
> In Android (where this feature originated), the ActivityManager tracks
> various application states (TOP_APP, FOREGROUND, BACKGROUND, SYSTEM,
> etc), and then as applications change states, the SchedPolicy logic
> will migrate the application tasks between different cgroups used
> to control the different application states (for example, there is a
> background cpuset cgroup which can limit background tasks to stay
> on one low-power cpu, and the bg_non_interactive cpuctrl cgroup can
> then further limit those background tasks to a small percentage of
> that one cpu's cpu time).
>
> However, for security reasons, Android doesn't want to make the
> system_server (the process that runs the ActivityManager and
> SchedPolicy logic), run as root. So in the Android common.git
> kernel, they have some logic to allow cgroups to loosen their
> permissions so CAP_SYS_NICE tasks can migrate other tasks between
> cgroups.
>
> I feel the approach taken there overloads CAP_SYS_NICE a bit much
> for non-android environments.
>
> So this patch, as suggested by Michael Kerrisk, simply adds a
> check for CAP_SYS_RESOURCE.
>
> I've tested this with AOSP master, and this seems to work well
> as Zygote and system_server already use CAP_SYS_RESOURCE. I've
> also submitted patches against the android-4.4 kernel to change
> it to use CAP_SYS_RESOURCE, and the Android developers just merged
> it.
>
> Cc: Tejun Heo <[email protected]>
> Cc: Li Zefan <[email protected]>
> Cc: Jonathan Corbet <[email protected]>
> Cc: [email protected]
> Cc: Android Kernel Team <[email protected]>
> Cc: Rom Lemarchand <[email protected]>
> Cc: Colin Cross <[email protected]>
> Cc: Dmitry Shmidt <[email protected]>
> Cc: Todd Kjos <[email protected]>
> Cc: Christian Poetzsch <[email protected]>
> Cc: Amit Pundir <[email protected]>
> Cc: Dmitry Torokhov <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Serge E. Hallyn <[email protected]>
> Cc: [email protected]
> Acked-by: Serge Hallyn <[email protected]>
> Signed-off-by: John Stultz <[email protected]>
> ---
> v2: Renamed to just CAP_CGROUP_MIGRATE as recommended by Tejun
> v3: Switched to just using CAP_SYS_RESOURCE as suggested by Michael
> v4: Send out properly folded down version of the patch. :P
> ---
>  kernel/cgroup.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 85bc9be..866059a 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2856,7 +2856,8 @@ static int cgroup_procs_write_permission(struct task_struct *task,
>          */
>         if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) &&
>             !uid_eq(cred->euid, tcred->uid) &&
> -           !uid_eq(cred->euid, tcred->suid))
> +           !uid_eq(cred->euid, tcred->suid) &&
> +           !ns_capable(tcred->user_ns, CAP_SYS_RESOURCE))
>                 ret = -EACCES;
>
>         if (!ret && cgroup_on_dfl(dst_cgrp)) {
> --
> 2.7.4
>
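
For anyone reading along, the condition after this change can be modeled
in plain userspace C roughly as below. This is only a sketch of the
boolean logic in the hunk: struct model_cred, MODEL_ROOT_UID and
model_procs_write_permission() are hypothetical names, not kernel API,
and the cap_sys_resource flag is an assumed stand-in for the result of
ns_capable(tcred->user_ns, CAP_SYS_RESOURCE) on the writing task.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the few cred fields the check reads. */
struct model_cred {
	unsigned int euid;
	unsigned int uid;
	unsigned int suid;
	/* Assumed stand-in for ns_capable(tcred->user_ns, CAP_SYS_RESOURCE). */
	bool cap_sys_resource;
};

#define MODEL_ROOT_UID 0u

/*
 * Model of the patched condition: the write is refused only when the
 * writer is not root, its euid matches neither the target's uid nor its
 * suid, and it lacks CAP_SYS_RESOURCE.
 */
static int model_procs_write_permission(const struct model_cred *cred,
					const struct model_cred *tcred)
{
	if (cred->euid != MODEL_ROOT_UID &&
	    cred->euid != tcred->uid &&
	    cred->euid != tcred->suid &&
	    !cred->cap_sys_resource)
		return -EACCES;

	return 0;
}
```

In this model a non-root writer with a different uid gets -EACCES
unless cap_sys_resource is set, which is the system_server case
described in the commit message.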
Reviewed-by: Kees Cook <[email protected]>

-Kees

--
Kees Cook
Nexus Security

