[PATCH v2 1/2] cgroup: delay the clearing of cgrp->kn->priv

Run these two scripts concurrently:

    for ((; ;))
    {
        mkdir /cgroup/sub
        rmdir /cgroup/sub
    }

    for ((; ;))
    {
        echo $$ > /cgroup/sub/cgroup.procs
        echo $$ > /cgroup/cgroup.procs
    }

A kernel bug will be triggered:

BUG: unable to handle kernel NULL pointer dereference at 0038
IP: [<c10bbd69>] cgroup_put+0x9/0x80
...
Call Trace:
 [<c10bbe19>] cgroup_kn_unlock+0x39/0x50
 [<c10bbe91>] cgroup_kn_lock_live+0x61/0x70
 [<c10be3c1>] __cgroup_procs_write.isra.26+0x51/0x230
 [<c10be5b2>] cgroup_tasks_write+0x12/0x20
 [<c10bb7b0>] cgroup_file_write+0x40/0x130
 [<c11aee71>] kernfs_fop_write+0xd1/0x160
 [<c1148e58>] vfs_write+0x98/0x1e0
 [<c114934d>] SyS_write+0x4d/0xa0
 [<c16f656b>] sysenter_do_call+0x12/0x12

We clear cgrp->kn->priv at the end of cgroup_rmdir(), but another
concurrent thread can access kn->priv after the clearing.

We should move the clearing to css_release_work_fn(). At that time
no one holds a reference to the cgroup and no one can gain a new
reference to access it.

v2:
- move RCU_INIT_POINTER() into the else block. (Tejun)
- remove the cgroup_parent() check. (Tejun)
- update the comment in css_tryget_online_from_dir().

Cc: <sta...@vger.kernel.org> # 3.15+
Reported-by: Toralf Förster <toralf.foers...@gmx.de>
Signed-off-by: Zefan Li <lize...@huawei.com>
---
 kernel/cgroup.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1c56924..205f793 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4181,6 +4181,15 @@ static void css_release_work_fn(struct work_struct *work)
 		/* cgroup release path */
 		cgroup_idr_remove(&cgrp->root->cgroup_idr, cgrp->id);
 		cgrp->id = -1;
+
+		/*
+		 * There are two control paths which try to determine
+		 * cgroup from dentry without going through kernfs -
+		 * cgroupstats_build() and css_tryget_online_from_dir().
+		 * Those are supported by RCU protecting clearing of
+		 * cgrp->kn->priv backpointer.
+		 */
+		RCU_INIT_POINTER(*(void __rcu __force **)&cgrp->kn->priv, NULL);
 	}
 
 	mutex_unlock(&cgroup_mutex);
@@ -4601,16 +4610,6 @@ static int cgroup_rmdir(struct kernfs_node *kn)
 
 	cgroup_kn_unlock(kn);
 
-	/*
-	 * There are two control paths which try to determine cgroup from
-	 * dentry without going through kernfs - cgroupstats_build() and
-	 * css_tryget_online_from_dir().  Those are supported by RCU
-	 * protecting clearing of cgrp->kn->priv backpointer, which should
-	 * happen after all files under it have been removed.
-	 */
-	if (!ret)
-		RCU_INIT_POINTER(*(void __rcu __force **)&kn->priv, NULL);
-
 	cgroup_put(cgrp);
 	return ret;
 }
@@ -5175,7 +5174,7 @@ struct cgroup_subsys_state *css_tryget_online_from_dir(struct dentry *dentry,
 	/*
 	 * This path doesn't originate from kernfs and @kn could already
 	 * have been or be removed at any point.  @kn->priv is RCU
-	 * protected for this access.  See cgroup_rmdir() for details.
+	 * protected for this access.  See css_release_work_fn() for details.
 	 */
 	cgrp = rcu_dereference(kn->priv);
 	if (cgrp)
-- 
1.8.0.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
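The lifetime argument in the changelog can be replayed in a small single-threaded userspace model. This is only a sketch under stated assumptions, not kernel code: every name in it (struct node, struct group, clear_policy, unlock_sees_null() and the helpers) is hypothetical, and the interleaving it replays is one plausible reading of the call trace, in which the write path reads the backpointer once when locking and again when unlocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum clear_policy { CLEAR_AT_RMDIR, CLEAR_AT_RELEASE };

struct group;

struct node {
	struct group *priv;	/* backpointer, models kn->priv */
};

struct group {
	int refcnt;
	bool dead;
	struct node *kn;
	enum clear_policy policy;
};

static void group_get(struct group *g)
{
	g->refcnt++;
}

/* Drop one reference; the final put models css_release_work_fn(). */
static void group_put(struct group *g)
{
	assert(g != NULL);	/* the kernel would oops here instead */
	if (--g->refcnt == 0 && g->policy == CLEAR_AT_RELEASE)
		g->kn->priv = NULL;
}

/* Models cgroup_rmdir(): mark dead, maybe clear, drop the base ref. */
static void group_rmdir(struct group *g)
{
	g->dead = true;
	if (g->policy == CLEAR_AT_RMDIR)
		g->kn->priv = NULL;
	group_put(g);
}

/*
 * Replay one fixed interleaving: the writer pins the group, the
 * remover runs to completion, then the writer reads the backpointer
 * again. Returns true if that second read would see NULL.
 */
bool unlock_sees_null(enum clear_policy policy)
{
	struct node kn = { NULL };
	struct group g = { .refcnt = 1, .dead = false,
			   .kn = &kn, .policy = policy };
	struct group *seen;
	bool saw_null;

	kn.priv = &g;

	/* Writer, lock step: read the backpointer and pin the group. */
	seen = kn.priv;
	group_get(seen);

	/* Remover runs to completion before the writer continues. */
	group_rmdir(&g);

	/* Writer, unlock step: the backpointer is read a second time. */
	seen = kn.priv;
	saw_null = (seen == NULL);

	/* Drop the writer's reference through a pointer known to be valid. */
	group_put(&g);
	return saw_null;
}
```

In this model, unlock_sees_null(CLEAR_AT_RMDIR) returns true and unlock_sees_null(CLEAR_AT_RELEASE) returns false. Clearing on the final put means any holder of a reference still finds the backpointer set.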