Re: [PATCH v3] perf: fix RCU issues with cgroup monitoring mode
On Wed, Nov 04, 2015 at 05:12:19AM +0100, Stephane Eranian wrote:
>
> This patch eliminates RCU violations detected by the RCU
> checker (PROVE_RCU). The impacted code paths were all related
> to cgroup mode monitoring and involved accessing a task's cgrp.
>
> V2 is updated to include comments from PeterZ to eliminate
> some of the warnings without grabbing the rcu_read_lock() because
> we know we are already holding the ctx->lock, which prevents
> the cgroup from disappearing while we are accessing it.
> The trick, as suggested by Peter, is to modify
> perf_cgroup_from_task() to take an extra boolean parameter
> to allow bypassing the lockdep test in the task_subsys_cstate()
> macros. This patch uses this approach to update all calls to
> perf_cgroup_from_task().
>
> In V3, we change the boolean parameter to a pointer to a
> perf_event_context so we can check the ctx->lock explicitly.
> This is more robust than passing a boolean to express that
> we know the lock is held. The code can change, and with it the
> locking assumption; checking lockdep_is_held() ensures
> the proper locking is in place. Patch relative to tip.git
> at commit 57ef9fc.

So aside from the reported build failures, this is not a suitable
Changelog.

Also, please split it into at least two patches, as there are at least
two distinct issues here.

One is the perf_cgroup_sched_{in,out} thing, which requires moving the
rcu_read_lock bits around; the other is the timestamp bits, which
require the ctx argument.
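For readers following the V2 versus V3 discussion quoted above, here is a minimal userspace sketch of the difference between passing a boolean and passing the context whose lock is checked. All names (mock_lock, mock_ctx, access_ok_bool, access_ok_ctx) are hypothetical stand-ins and not kernel APIs; the held flag only imitates what lockdep_is_held() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel types. */
struct mock_lock { bool held; };
struct mock_ctx  { struct mock_lock lock; };

static inline bool mock_lock_is_held(const struct mock_lock *l)
{
	return l->held;
}

/* V2 style: the caller merely claims the access is safe. */
static inline bool access_ok_bool(bool caller_says_safe)
{
	return caller_says_safe;
}

/*
 * V3 style: the callee inspects the lock itself. A NULL ctx mirrors the
 * patch's convention of skipping the lock check (the caller is expected
 * to be protected some other way, e.g. by an RCU read-side section).
 */
static inline bool access_ok_ctx(const struct mock_ctx *ctx)
{
	return ctx ? mock_lock_is_held(&ctx->lock) : true;
}

/* Small self-check showing where the two styles diverge. */
static inline void ctx_check_demo(void)
{
	struct mock_ctx ctx = { .lock = { .held = false } };

	assert(access_ok_bool(true));	/* stale claim still passes */
	assert(!access_ok_ctx(&ctx));	/* real state is checked */
}
```

The point of the second form is that a caller whose locking changes later cannot keep passing a stale "true"; the check follows the actual lock state.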
Re: [PATCH v3] perf: fix RCU issues with cgroup monitoring mode
Hi Stephane,

[auto build test WARNING on: tip/perf/core]
[also build test WARNING on: v4.3 next-20151103]

url:    https://github.com/0day-ci/linux/commits/Stephane-Eranian/perf-fix-RCU-issues-with-cgroup-monitoring-mode/20151104-121512
config: i386-randconfig-i0-201544 (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All warnings (new ones prefixed by >>):

   In file included from include/linux/trace_events.h:9:0,
                    from include/trace/syscall.h:6,
                    from include/linux/syscalls.h:81,
                    from init/main.c:18:
   include/linux/perf_event.h: In function 'perf_cgroup_from_task':
>> include/linux/perf_event.h:702:7: warning: unused variable 'safe' [-Wunused-variable]
     bool safe = ctx ? lockdep_is_held(&ctx->lock) : true;
          ^

vim +/safe +702 include/linux/perf_event.h

   686		u64 timestamp;
   687	};
   688
   689	struct perf_cgroup {
   690		struct cgroup_subsys_state css;
   691		struct perf_cgroup_info __percpu *info;
   692	};
   693
   694	/*
   695	 * Must ensure cgroup is pinned (css_get) before calling
   696	 * this function. In other words, we cannot call this function
   697	 * if there is no cgroup event for the current CPU context.
   698	 */
   699	static inline struct perf_cgroup *
   700	perf_cgroup_from_task(struct task_struct *task, struct perf_event_context *ctx)
   701	{
 > 702		bool safe = ctx ? lockdep_is_held(&ctx->lock) : true;
   703		return container_of(task_css_check(task, perf_event_cgrp_id, safe),
   704				    struct perf_cgroup, css);
   705	}
   706	#endif /* CONFIG_CGROUP_PERF */
   707
   708	#ifdef CONFIG_PERF_EVENTS
   709
   710	extern void *perf_aux_output_begin(struct perf_output_handle *handle,

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
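A plausible reading of this warning (an assumption, not something the report states) is that in this config the check macro expands without its condition argument, so the local variable feeding it is never used. The sketch below imitates that with a hypothetical macro, mock_deref_check, and shows one way to avoid the warning: pass the expression straight into the macro instead of going through a local. MOCK_DEBUG_CHECKS, mock_lock and lookup_inline are illustration-only names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a check macro that only exists in debug builds. */
#ifdef MOCK_DEBUG_CHECKS
#define mock_deref_check(ptr, cond)	((cond) ? (ptr) : NULL)
#else
#define mock_deref_check(ptr, cond)	(ptr)	/* cond is discarded here */
#endif

struct mock_lock { bool held; };

static inline bool mock_is_held(const struct mock_lock *l)
{
	return l->held;
}

/*
 * The condition is written inline as the macro argument. When the macro
 * drops its condition, no named local is left behind, so there is nothing
 * for -Wunused-variable to flag.
 */
static inline int *lookup_inline(int *p, const struct mock_lock *l)
{
	return mock_deref_check(p, l ? mock_is_held(l) : true);
}
```

With the checks compiled out, lookup_inline() simply returns its pointer; with MOCK_DEBUG_CHECKS defined it returns NULL when the lock is not held.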
Re: [PATCH v3] perf: fix RCU issues with cgroup monitoring mode
Hi Stephane,

[auto build test ERROR on: tip/perf/core]
[also build test ERROR on: v4.3 next-20151103]

url:    https://github.com/0day-ci/linux/commits/Stephane-Eranian/perf-fix-RCU-issues-with-cgroup-monitoring-mode/20151104-121512
config: parisc-allyesconfig (attached as .config)
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=parisc

All errors (new ones prefixed by >>):

   In file included from include/linux/trace_events.h:9:0,
                    from include/trace/syscall.h:6,
                    from include/linux/syscalls.h:81,
                    from kernel/events/core.c:34:
   include/linux/perf_event.h: In function 'perf_cgroup_from_task':
>> include/linux/perf_event.h:702:2: error: implicit declaration of function 'lockdep_is_held' [-Werror=implicit-function-declaration]
     bool safe = ctx ? lockdep_is_held(&ctx->lock) : true;
     ^
   include/linux/perf_event.h:702:7: warning: unused variable 'safe' [-Wunused-variable]
     bool safe = ctx ? lockdep_is_held(&ctx->lock) : true;
          ^
   cc1: some warnings being treated as errors

vim +/lockdep_is_held +702 include/linux/perf_event.h

   696	 * this function. In other words, we cannot call this function
   697	 * if there is no cgroup event for the current CPU context.
   698	 */
   699	static inline struct perf_cgroup *
   700	perf_cgroup_from_task(struct task_struct *task, struct perf_event_context *ctx)
   701	{
 > 702		bool safe = ctx ? lockdep_is_held(&ctx->lock) : true;
   703		return container_of(task_css_check(task, perf_event_cgrp_id, safe),
   704				    struct perf_cgroup, css);
   705	}

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
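The error suggests that lockdep_is_held() is simply not declared in this configuration, presumably because lock debugging is not built in (an assumption; the report does not say so). A common pattern for debug-only helpers is to provide a fallback for builds where the real one is absent. The sketch below shows that pattern in userspace with hypothetical names (MOCK_LOCK_DEBUG, mock_lock_is_held, ctx_access_ok); it is not the kernel's lockdep interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_lock { bool held; };

#ifdef MOCK_LOCK_DEBUG
/* Debug build: report the real (mock) lock state. */
static inline bool mock_lock_is_held(const struct mock_lock *l)
{
	return l->held;
}
#else
/*
 * Non-debug build: fallback that evaluates its argument and always
 * answers "held", so callers compile without #ifdefs of their own.
 */
#define mock_lock_is_held(l)	((void)(l), true)
#endif

/* Caller written once; builds with or without MOCK_LOCK_DEBUG. */
static inline bool ctx_access_ok(const struct mock_lock *l)
{
	return l ? mock_lock_is_held(l) : true;
}
```

The alternative is to keep the call site inside a macro argument that is only expanded in debug builds, which avoids needing any fallback at all.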
[PATCH v3] perf: fix RCU issues with cgroup monitoring mode
This patch eliminates RCU violations detected by the RCU
checker (PROVE_RCU). The impacted code paths were all related
to cgroup mode monitoring and involved accessing a task's cgrp.

V2 is updated to include comments from PeterZ to eliminate
some of the warnings without grabbing the rcu_read_lock() because
we know we are already holding the ctx->lock, which prevents
the cgroup from disappearing while we are accessing it.
The trick, as suggested by Peter, is to modify
perf_cgroup_from_task() to take an extra boolean parameter
to allow bypassing the lockdep test in the task_subsys_cstate()
macros. This patch uses this approach to update all calls to
perf_cgroup_from_task().

In V3, we change the boolean parameter to a pointer to a
perf_event_context so we can check the ctx->lock explicitly.
This is more robust than passing a boolean to express that
we know the lock is held. The code can change, and with it the
locking assumption; checking lockdep_is_held() ensures
the proper locking is in place.

Patch relative to tip.git at commit 57ef9fc.

Signed-off-by: Stephane Eranian
---
 arch/x86/kernel/cpu/perf_event_intel_cqm.c |  2 +-
 include/linux/perf_event.h                 |  5 +++--
 kernel/events/core.c                       | 25 +++--
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index 377e8f8..a316ca9 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -298,7 +298,7 @@ static bool __match_event(struct perf_event *a, struct perf_event *b)
 static inline struct perf_cgroup *event_to_cgroup(struct perf_event *event)
 {
 	if (event->attach_state & PERF_ATTACH_TASK)
-		return perf_cgroup_from_task(event->hw.target);
+		return perf_cgroup_from_task(event->hw.target, event->ctx);
 	return event->cgrp;
 }
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index d841d33..94107e4 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -697,9 +697,10 @@ struct perf_cgroup {
  * if there is no cgroup event for the current CPU context.
  */
 static inline struct perf_cgroup *
-perf_cgroup_from_task(struct task_struct *task)
+perf_cgroup_from_task(struct task_struct *task, struct perf_event_context *ctx)
 {
-	return container_of(task_css(task, perf_event_cgrp_id),
+	bool safe = ctx ? lockdep_is_held(&ctx->lock) : true;
+	return container_of(task_css_check(task, perf_event_cgrp_id, safe),
 			    struct perf_cgroup, css);
 }
 #endif /* CONFIG_CGROUP_PERF */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index ea02109..f611246 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -435,7 +435,7 @@ static inline void update_cgrp_time_from_event(struct perf_event *event)
 	if (!is_cgroup_event(event))
 		return;
-	cgrp = perf_cgroup_from_task(current);
+	cgrp = perf_cgroup_from_task(current, event->ctx);
 	/*
 	 * Do not update time when cgroup is not active
 	 */
@@ -458,7 +458,7 @@ perf_cgroup_set_timestamp(struct task_struct *task,
 	if (!task || !ctx->nr_cgroups)
 		return;
-	cgrp = perf_cgroup_from_task(task);
+	cgrp = perf_cgroup_from_task(task, ctx);
 	info = this_cpu_ptr(cgrp->info);
 	info->timestamp = ctx->timestamp;
 }
@@ -489,7 +489,6 @@ static void perf_cgroup_switch(struct task_struct *task, int mode)
 	 * we reschedule only in the presence of cgroup
 	 * constrained events.
 	 */
-	rcu_read_lock();
 	list_for_each_entry_rcu(pmu, &pmus, entry) {
 		cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
@@ -523,7 +522,7 @@ static void perf_cgroup_switch(struct task_struct *task, int mode)
 				 * event_filter_match() to not have to pass
 				 * task around
 				 */
-				cpuctx->cgrp = perf_cgroup_from_task(task);
+				cpuctx->cgrp = perf_cgroup_from_task(task, NULL);
 				cpu_ctx_sched_in(cpuctx, EVENT_ALL, task);
 			}
 			perf_pmu_enable(cpuctx->ctx.pmu);
@@ -531,8 +530,6 @@ static void perf_cgroup_switch(struct task_struct *task, int mode)
 		}
 	}
-	rcu_read_unlock();
-
 	local_irq_restore(flags);
 }
@@ -542,17 +539,18 @@ static inline void perf_cgroup_sched_out(struct task_struct *task,
 	struct perf_cgroup *cgrp1;
 	struct perf_cgroup *cgrp2 = NULL;
+	rcu_read_lock();
 	/*
 	 * we come here when we know perf_cgroup_events > 0
 	 */
-	cgrp1 = perf_cgroup_from_task(task);
+	cgrp1 = perf_cgroup_from_task(task, NULL);
 	/*
 	 * next is NULL when called from