Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
2018-03-12 20:24 GMT+08:00 Peter Zijlstra: > On Wed, Mar 07, 2018 at 07:19:15PM +0800, Lin Xiulei wrote: > >> >> + /* >> >> + * if only the cgroup is running on this cpu >> >> + * and cpuctx->cgrp == NULL (otherwise it would've >> >> + * been set with running cgroup), we put this cgroup >> >> + * into cpu context. Or it would case mismatch in >> >> + * following cgroup events at event_filter_match() >> >> + */ >> > >> > This is utterly incomprehensible, what? >> >> Yes, this is bit messy. I should've made it clear. This comment was supposed >> to explain the reason why I modified the if statement below. >> >> And the logic is >> >> 1) when cpuctx-> cgrp is NULL, we __must__ take care of how to set it >> appropriately, that means, we __have to__ check if the cgroup is running >> on the cpu > > Yes, however, the changelog needs to explain why this was > changed, and the above does not explain where the old code was wrong. > Yes, it's good to be involved in community since you guys give great opinions > In what case do we fail to do 1 with the current code? > With current code, it doesn't check if the cgroup is running on the CPU. then problem happens when two cgroup events on the same CPU are opened in a row, which means no schedule happen between it 1) event A on cgroup A has no tasks running on this CPU 2) event B on cgroup B has some task running on the CPU With the current code, cpuctx->cgrp would be set as cgroup A arbitrarily, then it opens event B, cpuctx->cgrp will not be set anymore, and it fails in event_filter_match() because cpuctx->cgrp != cgroup B. We have to wait schedule() happened to correct it. But in this situation, we lost a slight period that measures event B. > The existing nr_cgroups logic ensures we only continue for the > first/last cgroup event for that context. Is the problem that if the > first cgroup is _not_ of the current cgroup and we correctly do _not_ > set cpuctx->cgrp, a second event that _is_ of the right cgroup will then > not have the opportunity to set cpuctx->cgrp ? > Yes, You got it. >> 2) when cpuctx-> cgrp is __NOT__ NULL. It means cpuctx->cgrp had been >> set appropriately by cgroup_switch() or list_update_cgroup_event() before. >> Therefore, We do __nothing__ here > > Right, but my variant doubled checked it. If its _not_ NULL it must be > the current task's cgrp, otherwise we have a problem. Same difference, > more paranoia :-) > > But I suppose you're right and we can avoid a bunch of looking up if > cpuctx->cgrp is already set. > Yes, actually you have a good instinct. But what my further concern is when we should set cpuctx->cgrp to NULL again when one running event is closed ? By this way, we can avoid a bunch of checking cgroup_is_descendant(). I didn't figure it out in this patch, and of course I didn't make it worse. :-) > > How is the below patch? > You're AWESOME : ) > --- > Subject: perf/core: Fix installing cgroup events on CPU > From: "leilei.lin" > Date: Tue, 6 Mar 2018 17:36:37 +0800 > > There's two problems when installing cgroup events on CPUs: firstly > list_update_cgroup_event() only tries to set cpuctx->cgrp for the > first event, if that mismatches on @cgrp we'll not try again for later > additions. > > Secondly, when we install a cgroup event into an active context, only > issue an event reprogram when the event matches the current cgroup > context. This avoids a pointless event reprogramming. > > Cc: a...@kernel.org > Cc: yang_oli...@hotmail.com > Cc: mi...@redhat.com > Cc: jo...@redhat.com > Cc: eran...@gmail.com > Cc: torva...@linux-foundation.org > Cc: t...@linutronix.de > Cc: brendan.d.gr...@gmail.com > Cc: alexander.shish...@linux.intel.com > Signed-off-by: leilei.lin > [peterz: Changelog and comments] > Signed-off-by: Peter Zijlstra (Intel) > Link: http://lkml.kernel.org/r/20180306093637.28247-1-linxiu...@gmail.com > --- > kernel/events/core.c | 46 +++--- > 1 file changed, 35 insertions(+), 11 deletions(-) > > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -937,27 +937,39 @@ list_update_cgroup_event(struct perf_eve > if (!is_cgroup_event(event)) > return; > > - if (add && ctx->nr_cgroups++) > - return; > - else if (!add && --ctx->nr_cgroups) > - return; > /* > * Because cgroup events are always per-cpu events, > * this will always be called from the right CPU. > */ > cpuctx = __get_cpu_context(ctx); > - cpuctx_entry = >cgrp_cpuctx_entry; > - /* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU > .*/ > - if (add) { > + > + /* > +* Since setting cpuctx->cgrp is conditional on the current @cgrp > +* matching the event's cgroup, we must do this for every new event, > +
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
2018-03-12 20:24 GMT+08:00 Peter Zijlstra : > On Wed, Mar 07, 2018 at 07:19:15PM +0800, Lin Xiulei wrote: > >> >> + /* >> >> + * if only the cgroup is running on this cpu >> >> + * and cpuctx->cgrp == NULL (otherwise it would've >> >> + * been set with running cgroup), we put this cgroup >> >> + * into cpu context. Or it would case mismatch in >> >> + * following cgroup events at event_filter_match() >> >> + */ >> > >> > This is utterly incomprehensible, what? >> >> Yes, this is bit messy. I should've made it clear. This comment was supposed >> to explain the reason why I modified the if statement below. >> >> And the logic is >> >> 1) when cpuctx-> cgrp is NULL, we __must__ take care of how to set it >> appropriately, that means, we __have to__ check if the cgroup is running >> on the cpu > > Yes, however, the changelog needs to explain why this was > changed, and the above does not explain where the old code was wrong. > Yes, it's good to be involved in community since you guys give great opinions > In what case do we fail to do 1 with the current code? > With current code, it doesn't check if the cgroup is running on the CPU. then problem happens when two cgroup events on the same CPU are opened in a row, which means no schedule happen between it 1) event A on cgroup A has no tasks running on this CPU 2) event B on cgroup B has some task running on the CPU With the current code, cpuctx->cgrp would be set as cgroup A arbitrarily, then it opens event B, cpuctx->cgrp will not be set anymore, and it fails in event_filter_match() because cpuctx->cgrp != cgroup B. We have to wait schedule() happened to correct it. But in this situation, we lost a slight period that measures event B. > The existing nr_cgroups logic ensures we only continue for the > first/last cgroup event for that context. Is the problem that if the > first cgroup is _not_ of the current cgroup and we correctly do _not_ > set cpuctx->cgrp, a second event that _is_ of the right cgroup will then > not have the opportunity to set cpuctx->cgrp ? > Yes, You got it. >> 2) when cpuctx-> cgrp is __NOT__ NULL. It means cpuctx->cgrp had been >> set appropriately by cgroup_switch() or list_update_cgroup_event() before. >> Therefore, We do __nothing__ here > > Right, but my variant doubled checked it. If its _not_ NULL it must be > the current task's cgrp, otherwise we have a problem. Same difference, > more paranoia :-) > > But I suppose you're right and we can avoid a bunch of looking up if > cpuctx->cgrp is already set. > Yes, actually you have a good instinct. But what my further concern is when we should set cpuctx->cgrp to NULL again when one running event is closed ? By this way, we can avoid a bunch of checking cgroup_is_descendant(). I didn't figure it out in this patch, and of course I didn't make it worse. :-) > > How is the below patch? > You're AWESOME : ) > --- > Subject: perf/core: Fix installing cgroup events on CPU > From: "leilei.lin" > Date: Tue, 6 Mar 2018 17:36:37 +0800 > > There's two problems when installing cgroup events on CPUs: firstly > list_update_cgroup_event() only tries to set cpuctx->cgrp for the > first event, if that mismatches on @cgrp we'll not try again for later > additions. > > Secondly, when we install a cgroup event into an active context, only > issue an event reprogram when the event matches the current cgroup > context. This avoids a pointless event reprogramming. > > Cc: a...@kernel.org > Cc: yang_oli...@hotmail.com > Cc: mi...@redhat.com > Cc: jo...@redhat.com > Cc: eran...@gmail.com > Cc: torva...@linux-foundation.org > Cc: t...@linutronix.de > Cc: brendan.d.gr...@gmail.com > Cc: alexander.shish...@linux.intel.com > Signed-off-by: leilei.lin > [peterz: Changelog and comments] > Signed-off-by: Peter Zijlstra (Intel) > Link: http://lkml.kernel.org/r/20180306093637.28247-1-linxiu...@gmail.com > --- > kernel/events/core.c | 46 +++--- > 1 file changed, 35 insertions(+), 11 deletions(-) > > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -937,27 +937,39 @@ list_update_cgroup_event(struct perf_eve > if (!is_cgroup_event(event)) > return; > > - if (add && ctx->nr_cgroups++) > - return; > - else if (!add && --ctx->nr_cgroups) > - return; > /* > * Because cgroup events are always per-cpu events, > * this will always be called from the right CPU. > */ > cpuctx = __get_cpu_context(ctx); > - cpuctx_entry = >cgrp_cpuctx_entry; > - /* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU > .*/ > - if (add) { > + > + /* > +* Since setting cpuctx->cgrp is conditional on the current @cgrp > +* matching the event's cgroup, we must do this for every new event, > +* because if the first would mismatch, the second would not try again > +* and we would
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
On Wed, Mar 07, 2018 at 07:19:15PM +0800, Lin Xiulei wrote: > >> + /* > >> + * if only the cgroup is running on this cpu > >> + * and cpuctx->cgrp == NULL (otherwise it would've > >> + * been set with running cgroup), we put this cgroup > >> + * into cpu context. Or it would case mismatch in > >> + * following cgroup events at event_filter_match() > >> + */ > > > > This is utterly incomprehensible, what? > > Yes, this is bit messy. I should've made it clear. This comment was supposed > to explain the reason why I modified the if statement below. > > And the logic is > > 1) when cpuctx-> cgrp is NULL, we __must__ take care of how to set it > appropriately, that means, we __have to__ check if the cgroup is running > on the cpu Yes, however, the changelog needs to explain why this was changed, and the above does not explain where the old code was wrong. In what case do we fail to do 1 with the current code? The existing nr_cgroups logic ensures we only continue for the first/last cgroup event for that context. Is the problem that if the first cgroup is _not_ of the current cgroup and we correctly do _not_ set cpuctx->cgrp, a second event that _is_ of the right cgroup will then not have the opportunity to set cpuctx->cgrp ? > 2) when cpuctx-> cgrp is __NOT__ NULL. It means cpuctx->cgrp had been > set appropriately by cgroup_switch() or list_update_cgroup_event() before. > Therefore, We do __nothing__ here Right, but my variant doubled checked it. If its _not_ NULL it must be the current task's cgrp, otherwise we have a problem. Same difference, more paranoia :-) But I suppose you're right and we can avoid a bunch of looking up if cpuctx->cgrp is already set. How is the below patch? --- Subject: perf/core: Fix installing cgroup events on CPU From: "leilei.lin"Date: Tue, 6 Mar 2018 17:36:37 +0800 There's two problems when installing cgroup events on CPUs: firstly list_update_cgroup_event() only tries to set cpuctx->cgrp for the first event, if that mismatches on @cgrp we'll not try again for later additions. Secondly, when we install a cgroup event into an active context, only issue an event reprogram when the event matches the current cgroup context. This avoids a pointless event reprogramming. Cc: a...@kernel.org Cc: yang_oli...@hotmail.com Cc: mi...@redhat.com Cc: jo...@redhat.com Cc: eran...@gmail.com Cc: torva...@linux-foundation.org Cc: t...@linutronix.de Cc: brendan.d.gr...@gmail.com Cc: alexander.shish...@linux.intel.com Signed-off-by: leilei.lin [peterz: Changelog and comments] Signed-off-by: Peter Zijlstra (Intel) Link: http://lkml.kernel.org/r/20180306093637.28247-1-linxiu...@gmail.com --- kernel/events/core.c | 46 +++--- 1 file changed, 35 insertions(+), 11 deletions(-) --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -937,27 +937,39 @@ list_update_cgroup_event(struct perf_eve if (!is_cgroup_event(event)) return; - if (add && ctx->nr_cgroups++) - return; - else if (!add && --ctx->nr_cgroups) - return; /* * Because cgroup events are always per-cpu events, * this will always be called from the right CPU. */ cpuctx = __get_cpu_context(ctx); - cpuctx_entry = >cgrp_cpuctx_entry; - /* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU .*/ - if (add) { + + /* +* Since setting cpuctx->cgrp is conditional on the current @cgrp +* matching the event's cgroup, we must do this for every new event, +* because if the first would mismatch, the second would not try again +* and we would leave cpuctx->cgrp unset. +*/ + if (add && !cpuctx->cgrp) { struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx); - list_add(cpuctx_entry, this_cpu_ptr(_cpuctx_list)); if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) cpuctx->cgrp = cgrp; - } else { - list_del(cpuctx_entry); - cpuctx->cgrp = NULL; } + + if (add && ctx->nr_cgroups++) + return; + else if (!add && --ctx->nr_cgroups) + return; + + /* no cgroup running */ + if (!add) + cpuctx->cgrp = NULL; + + cpuctx_entry = >cgrp_cpuctx_entry; + if (add) + list_add(cpuctx_entry, this_cpu_ptr(_cpuctx_list)); + else + list_del(cpuctx_entry); } #else /* !CONFIG_CGROUP_PERF */ @@ -2491,6 +2503,18 @@ static int __perf_install_in_context(vo raw_spin_lock(_ctx->lock); } +#ifdef CONFIG_CGROUP_PERF + if (is_cgroup_event(event)) { + /* +* If the current cgroup doesn't match the event's +*
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
On Wed, Mar 07, 2018 at 07:19:15PM +0800, Lin Xiulei wrote: > >> + /* > >> + * if only the cgroup is running on this cpu > >> + * and cpuctx->cgrp == NULL (otherwise it would've > >> + * been set with running cgroup), we put this cgroup > >> + * into cpu context. Or it would case mismatch in > >> + * following cgroup events at event_filter_match() > >> + */ > > > > This is utterly incomprehensible, what? > > Yes, this is bit messy. I should've made it clear. This comment was supposed > to explain the reason why I modified the if statement below. > > And the logic is > > 1) when cpuctx-> cgrp is NULL, we __must__ take care of how to set it > appropriately, that means, we __have to__ check if the cgroup is running > on the cpu Yes, however, the changelog needs to explain why this was changed, and the above does not explain where the old code was wrong. In what case do we fail to do 1 with the current code? The existing nr_cgroups logic ensures we only continue for the first/last cgroup event for that context. Is the problem that if the first cgroup is _not_ of the current cgroup and we correctly do _not_ set cpuctx->cgrp, a second event that _is_ of the right cgroup will then not have the opportunity to set cpuctx->cgrp ? > 2) when cpuctx-> cgrp is __NOT__ NULL. It means cpuctx->cgrp had been > set appropriately by cgroup_switch() or list_update_cgroup_event() before. > Therefore, We do __nothing__ here Right, but my variant doubled checked it. If its _not_ NULL it must be the current task's cgrp, otherwise we have a problem. Same difference, more paranoia :-) But I suppose you're right and we can avoid a bunch of looking up if cpuctx->cgrp is already set. How is the below patch? --- Subject: perf/core: Fix installing cgroup events on CPU From: "leilei.lin" Date: Tue, 6 Mar 2018 17:36:37 +0800 There's two problems when installing cgroup events on CPUs: firstly list_update_cgroup_event() only tries to set cpuctx->cgrp for the first event, if that mismatches on @cgrp we'll not try again for later additions. Secondly, when we install a cgroup event into an active context, only issue an event reprogram when the event matches the current cgroup context. This avoids a pointless event reprogramming. Cc: a...@kernel.org Cc: yang_oli...@hotmail.com Cc: mi...@redhat.com Cc: jo...@redhat.com Cc: eran...@gmail.com Cc: torva...@linux-foundation.org Cc: t...@linutronix.de Cc: brendan.d.gr...@gmail.com Cc: alexander.shish...@linux.intel.com Signed-off-by: leilei.lin [peterz: Changelog and comments] Signed-off-by: Peter Zijlstra (Intel) Link: http://lkml.kernel.org/r/20180306093637.28247-1-linxiu...@gmail.com --- kernel/events/core.c | 46 +++--- 1 file changed, 35 insertions(+), 11 deletions(-) --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -937,27 +937,39 @@ list_update_cgroup_event(struct perf_eve if (!is_cgroup_event(event)) return; - if (add && ctx->nr_cgroups++) - return; - else if (!add && --ctx->nr_cgroups) - return; /* * Because cgroup events are always per-cpu events, * this will always be called from the right CPU. */ cpuctx = __get_cpu_context(ctx); - cpuctx_entry = >cgrp_cpuctx_entry; - /* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU .*/ - if (add) { + + /* +* Since setting cpuctx->cgrp is conditional on the current @cgrp +* matching the event's cgroup, we must do this for every new event, +* because if the first would mismatch, the second would not try again +* and we would leave cpuctx->cgrp unset. +*/ + if (add && !cpuctx->cgrp) { struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx); - list_add(cpuctx_entry, this_cpu_ptr(_cpuctx_list)); if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) cpuctx->cgrp = cgrp; - } else { - list_del(cpuctx_entry); - cpuctx->cgrp = NULL; } + + if (add && ctx->nr_cgroups++) + return; + else if (!add && --ctx->nr_cgroups) + return; + + /* no cgroup running */ + if (!add) + cpuctx->cgrp = NULL; + + cpuctx_entry = >cgrp_cpuctx_entry; + if (add) + list_add(cpuctx_entry, this_cpu_ptr(_cpuctx_list)); + else + list_del(cpuctx_entry); } #else /* !CONFIG_CGROUP_PERF */ @@ -2491,6 +2503,18 @@ static int __perf_install_in_context(vo raw_spin_lock(_ctx->lock); } +#ifdef CONFIG_CGROUP_PERF + if (is_cgroup_event(event)) { + /* +* If the current cgroup doesn't match the event's +* cgroup, we should not try to schedule it. +*/ +
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
2018-03-12 15:53 GMT+08:00 Ingo Molnar: > > * linxiu...@gmail.com wrote: > >> /* >>* Because cgroup events are always per-cpu events, >>* this will always be called from the right CPU. >>*/ > >> + /* >> + * if only the cgroup is running on this cpu >> + * and cpuctx->cgrp == NULL (otherwise it would've >> + * been set with running cgroup), we put this cgroup >> + * into cpu context. Or it would case mismatch in >> + * following cgroup events at event_filter_match() >> + */ > > Beyond making sure that what you comment on makes sense, please also follow > existing comment style, which I quoted above. > > There's 3 separate mistakes in that paragraph alone. > > Thanks, > > Ingo Thank Ingo for pointing it out. though I'd try to follow it as possible as I could. I wish I have somewhere to check style with specific documents : ) + /* +* We will set cpuctx->cgrp only if: +* 1) cpuctx->cgrp is NULL. Because cpuctx->cgrp was set +* correctly by sched_in(). We don't need to set it again. +* 2) cgroup related to this event is running on this CPU. +* Otherwise it will fail in event_filter_match() +*/ Peter, did I explain that logic clearly and prove it right? Regards
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
2018-03-12 15:53 GMT+08:00 Ingo Molnar : > > * linxiu...@gmail.com wrote: > >> /* >>* Because cgroup events are always per-cpu events, >>* this will always be called from the right CPU. >>*/ > >> + /* >> + * if only the cgroup is running on this cpu >> + * and cpuctx->cgrp == NULL (otherwise it would've >> + * been set with running cgroup), we put this cgroup >> + * into cpu context. Or it would case mismatch in >> + * following cgroup events at event_filter_match() >> + */ > > Beyond making sure that what you comment on makes sense, please also follow > existing comment style, which I quoted above. > > There's 3 separate mistakes in that paragraph alone. > > Thanks, > > Ingo Thank Ingo for pointing it out. though I'd try to follow it as possible as I could. I wish I have somewhere to check style with specific documents : ) + /* +* We will set cpuctx->cgrp only if: +* 1) cpuctx->cgrp is NULL. Because cpuctx->cgrp was set +* correctly by sched_in(). We don't need to set it again. +* 2) cgroup related to this event is running on this CPU. +* Otherwise it will fail in event_filter_match() +*/ Peter, did I explain that logic clearly and prove it right? Regards
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
* linxiu...@gmail.comwrote: > /* >* Because cgroup events are always per-cpu events, >* this will always be called from the right CPU. >*/ > + /* > + * if only the cgroup is running on this cpu > + * and cpuctx->cgrp == NULL (otherwise it would've > + * been set with running cgroup), we put this cgroup > + * into cpu context. Or it would case mismatch in > + * following cgroup events at event_filter_match() > + */ Beyond making sure that what you comment on makes sense, please also follow existing comment style, which I quoted above. There's 3 separate mistakes in that paragraph alone. Thanks, Ingo
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
* linxiu...@gmail.com wrote: > /* >* Because cgroup events are always per-cpu events, >* this will always be called from the right CPU. >*/ > + /* > + * if only the cgroup is running on this cpu > + * and cpuctx->cgrp == NULL (otherwise it would've > + * been set with running cgroup), we put this cgroup > + * into cpu context. Or it would case mismatch in > + * following cgroup events at event_filter_match() > + */ Beyond making sure that what you comment on makes sense, please also follow existing comment style, which I quoted above. There's 3 separate mistakes in that paragraph alone. Thanks, Ingo
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
2018-03-06 19:50 GMT+08:00 Peter Zijlstra: > On Tue, Mar 06, 2018 at 05:36:37PM +0800, linxiu...@gmail.com wrote: >> From: "leilei.lin" >> >> Do not install cgroup event into the CPU context and schedule it >> if the cgroup is not running on this CPU > > OK, so far so good, this explains the bit in > __perf_install_in_context(). > Actually, the new codes in __perf_install_in_context() only takes care whether if events should be scheduled with PMU. >> While there is no task of cgroup running specified CPU, current >> kernel still install cgroup event into CPU context that causes >> another cgroup event can't be installed into this CPU. >> >> This patch prevent scheduling events at __perf_install_in_context() >> and installing events at list_update_cgroup_event() if cgroup isn't >> running on specified CPU. > > This bit doesn't make sense, you don't in fact avoid anything in > list_update_cgroup_event(), you do more, not less. > And the new codes in list_update_cgroup_event() don't want cpuctx->cgrp to be set arbitrarily. The more logic, you mentioned, was added for making sure cpuctx->cgrp is set consistently with the cgroup running on the cpu. >> diff --git a/kernel/events/core.c b/kernel/events/core.c >> index 4df5b69..f3ffa70 100644 >> --- a/kernel/events/core.c >> +++ b/kernel/events/core.c >> @@ -933,31 +933,45 @@ list_update_cgroup_event(struct perf_event *event, >> { >> struct perf_cpu_context *cpuctx; >> struct list_head *cpuctx_entry; >> + struct perf_cgroup *cgrp; >> >> if (!is_cgroup_event(event)) >> return; >> >> /* >>* Because cgroup events are always per-cpu events, >>* this will always be called from the right CPU. >>*/ >> cpuctx = __get_cpu_context(ctx); >> + cgrp = perf_cgroup_from_task(current, ctx); >> + >> + /* >> + * if only the cgroup is running on this cpu >> + * and cpuctx->cgrp == NULL (otherwise it would've >> + * been set with running cgroup), we put this cgroup >> + * into cpu context. Or it would case mismatch in >> + * following cgroup events at event_filter_match() >> + */ > > This is utterly incomprehensible, what? Yes, this is bit messy. I should've made it clear. This comment was supposed to explain the reason why I modified the if statement below. And the logic is 1) when cpuctx-> cgrp is NULL, we __must__ take care of how to set it appropriately, that means, we __have to__ check if the cgroup is running on the cpu 2) when cpuctx-> cgrp is __NOT__ NULL. It means cpuctx->cgrp had been set appropriately by cgroup_switch() or list_update_cgroup_event() before. Therefore, We do __nothing__ here > >> + if (add && !cpuctx->cgrp && >> + cgroup_is_descendant(cgrp->css.cgroup, >> + event->cgrp->css.cgroup)) { >> + cpuctx->cgrp = cgrp; >> + } > > And that's just horrible coding style. Maybe something like: > > if (add && cgroup_is_descendant(cgrp->css.cgroup, > event->cgrp->css.cgroup)) { > if (cpuctx->cgrp) > WARN_ON_ONCE(cpuctx->cgrp != cgrp); > cpuctx->cgrp = cgrp; > } > > that? But that still needs a comment to explain _why_ we do that here. > Under what condition would we fail to have cpuctx->cgrp set while > ctx->nr_cgroups. Your comment doesn't explain nor does your Changelog. > if (cpuctx->cgrp == NULL) /* As I said above, we only take care this case. */ if (add && cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) { /* only when this cgroup is running */ cpuctx->cgrp = cgrp; } >> + >> + if (add && ctx->nr_cgroups++) >> + return; >> + else if (!add && --ctx->nr_cgroups) >> + return; >> >> + /* no cgroup running */ >> + if (!add) >> + cpuctx->cgrp = NULL; >> + >> + cpuctx_entry = >cgrp_cpuctx_entry; >> + if (add) >> list_add(cpuctx_entry, this_cpu_ptr(_cpuctx_list)); >> + else >> list_del(cpuctx_entry); >> } >> >> #else /* !CONFIG_CGROUP_PERF */ >> @@ -2311,6 +2325,20 @@ static int __perf_install_in_context(void *info) >> raw_spin_lock(_ctx->lock); >> } >> >> +#ifdef CONFIG_CGROUP_PERF >> + if (is_cgroup_event(event)) { >> + /* >> + * Only care about cgroup events. >> + * > > That bit is entirely spurious, if it right after if (is_cgroup_event()), > obviously this block is only for cgroup events. > Totally, : ) >> + * If only the task belongs to cgroup of this event, >> + * we will continue the installment > > And that isn't really english. I think you meant to write something > like: > > /* > * If the current cgroup doesn't match the event's > * cgroup, we
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
2018-03-06 19:50 GMT+08:00 Peter Zijlstra : > On Tue, Mar 06, 2018 at 05:36:37PM +0800, linxiu...@gmail.com wrote: >> From: "leilei.lin" >> >> Do not install cgroup event into the CPU context and schedule it >> if the cgroup is not running on this CPU > > OK, so far so good, this explains the bit in > __perf_install_in_context(). > Actually, the new codes in __perf_install_in_context() only takes care whether if events should be scheduled with PMU. >> While there is no task of cgroup running specified CPU, current >> kernel still install cgroup event into CPU context that causes >> another cgroup event can't be installed into this CPU. >> >> This patch prevent scheduling events at __perf_install_in_context() >> and installing events at list_update_cgroup_event() if cgroup isn't >> running on specified CPU. > > This bit doesn't make sense, you don't in fact avoid anything in > list_update_cgroup_event(), you do more, not less. > And the new codes in list_update_cgroup_event() don't want cpuctx->cgrp to be set arbitrarily. The more logic, you mentioned, was added for making sure cpuctx->cgrp is set consistently with the cgroup running on the cpu. >> diff --git a/kernel/events/core.c b/kernel/events/core.c >> index 4df5b69..f3ffa70 100644 >> --- a/kernel/events/core.c >> +++ b/kernel/events/core.c >> @@ -933,31 +933,45 @@ list_update_cgroup_event(struct perf_event *event, >> { >> struct perf_cpu_context *cpuctx; >> struct list_head *cpuctx_entry; >> + struct perf_cgroup *cgrp; >> >> if (!is_cgroup_event(event)) >> return; >> >> /* >>* Because cgroup events are always per-cpu events, >>* this will always be called from the right CPU. >>*/ >> cpuctx = __get_cpu_context(ctx); >> + cgrp = perf_cgroup_from_task(current, ctx); >> + >> + /* >> + * if only the cgroup is running on this cpu >> + * and cpuctx->cgrp == NULL (otherwise it would've >> + * been set with running cgroup), we put this cgroup >> + * into cpu context. Or it would case mismatch in >> + * following cgroup events at event_filter_match() >> + */ > > This is utterly incomprehensible, what? Yes, this is bit messy. I should've made it clear. This comment was supposed to explain the reason why I modified the if statement below. And the logic is 1) when cpuctx-> cgrp is NULL, we __must__ take care of how to set it appropriately, that means, we __have to__ check if the cgroup is running on the cpu 2) when cpuctx-> cgrp is __NOT__ NULL. It means cpuctx->cgrp had been set appropriately by cgroup_switch() or list_update_cgroup_event() before. Therefore, We do __nothing__ here > >> + if (add && !cpuctx->cgrp && >> + cgroup_is_descendant(cgrp->css.cgroup, >> + event->cgrp->css.cgroup)) { >> + cpuctx->cgrp = cgrp; >> + } > > And that's just horrible coding style. Maybe something like: > > if (add && cgroup_is_descendant(cgrp->css.cgroup, > event->cgrp->css.cgroup)) { > if (cpuctx->cgrp) > WARN_ON_ONCE(cpuctx->cgrp != cgrp); > cpuctx->cgrp = cgrp; > } > > that? But that still needs a comment to explain _why_ we do that here. > Under what condition would we fail to have cpuctx->cgrp set while > ctx->nr_cgroups. Your comment doesn't explain nor does your Changelog. > if (cpuctx->cgrp == NULL) /* As I said above, we only take care this case. */ if (add && cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) { /* only when this cgroup is running */ cpuctx->cgrp = cgrp; } >> + >> + if (add && ctx->nr_cgroups++) >> + return; >> + else if (!add && --ctx->nr_cgroups) >> + return; >> >> + /* no cgroup running */ >> + if (!add) >> + cpuctx->cgrp = NULL; >> + >> + cpuctx_entry = >cgrp_cpuctx_entry; >> + if (add) >> list_add(cpuctx_entry, this_cpu_ptr(_cpuctx_list)); >> + else >> list_del(cpuctx_entry); >> } >> >> #else /* !CONFIG_CGROUP_PERF */ >> @@ -2311,6 +2325,20 @@ static int __perf_install_in_context(void *info) >> raw_spin_lock(_ctx->lock); >> } >> >> +#ifdef CONFIG_CGROUP_PERF >> + if (is_cgroup_event(event)) { >> + /* >> + * Only care about cgroup events. >> + * > > That bit is entirely spurious, if it right after if (is_cgroup_event()), > obviously this block is only for cgroup events. > Totally, : ) >> + * If only the task belongs to cgroup of this event, >> + * we will continue the installment > > And that isn't really english. I think you meant to write something > like: > > /* > * If the current cgroup doesn't match the event's > * cgroup, we should not try to schedule it. >
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
On Tue, Mar 06, 2018 at 05:36:37PM +0800, linxiu...@gmail.com wrote: > From: "leilei.lin"> > Do not install cgroup event into the CPU context and schedule it > if the cgroup is not running on this CPU OK, so far so good, this explains the bit in __perf_install_in_context(). > While there is no task of cgroup running specified CPU, current > kernel still install cgroup event into CPU context that causes > another cgroup event can't be installed into this CPU. > > This patch prevent scheduling events at __perf_install_in_context() > and installing events at list_update_cgroup_event() if cgroup isn't > running on specified CPU. This bit doesn't make sense, you don't in fact avoid anything in list_update_cgroup_event(), you do more, not less. > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 4df5b69..f3ffa70 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -933,31 +933,45 @@ list_update_cgroup_event(struct perf_event *event, > { > struct perf_cpu_context *cpuctx; > struct list_head *cpuctx_entry; > + struct perf_cgroup *cgrp; > > if (!is_cgroup_event(event)) > return; > > /* >* Because cgroup events are always per-cpu events, >* this will always be called from the right CPU. >*/ > cpuctx = __get_cpu_context(ctx); > + cgrp = perf_cgroup_from_task(current, ctx); > + > + /* > + * if only the cgroup is running on this cpu > + * and cpuctx->cgrp == NULL (otherwise it would've > + * been set with running cgroup), we put this cgroup > + * into cpu context. Or it would case mismatch in > + * following cgroup events at event_filter_match() > + */ This is utterly incomprehensible, what? > + if (add && !cpuctx->cgrp && > + cgroup_is_descendant(cgrp->css.cgroup, > + event->cgrp->css.cgroup)) { > + cpuctx->cgrp = cgrp; > + } And that's just horrible coding style. Maybe something like: if (add && cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) { if (cpuctx->cgrp) WARN_ON_ONCE(cpuctx->cgrp != cgrp); cpuctx->cgrp = cgrp; } that? But that still needs a comment to explain _why_ we do that here. Under what condition would we fail to have cpuctx->cgrp set while ctx->nr_cgroups. Your comment doesn't explain nor does your Changelog. > + > + if (add && ctx->nr_cgroups++) > + return; > + else if (!add && --ctx->nr_cgroups) > + return; > > + /* no cgroup running */ > + if (!add) > + cpuctx->cgrp = NULL; > + > + cpuctx_entry = >cgrp_cpuctx_entry; > + if (add) > list_add(cpuctx_entry, this_cpu_ptr(_cpuctx_list)); > + else > list_del(cpuctx_entry); > } > > #else /* !CONFIG_CGROUP_PERF */ > @@ -2311,6 +2325,20 @@ static int __perf_install_in_context(void *info) > raw_spin_lock(_ctx->lock); > } > > +#ifdef CONFIG_CGROUP_PERF > + if (is_cgroup_event(event)) { > + /* > + * Only care about cgroup events. > + * That bit is entirely spurious, if it right after if (is_cgroup_event()), obviously this block is only for cgroup events. > + * If only the task belongs to cgroup of this event, > + * we will continue the installment And that isn't really english. I think you meant to write something like: /* * If the current cgroup doesn't match the event's * cgroup, we should not try to schedule it. */ > + */ > + struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx); > + reprogram = cgroup_is_descendant(cgrp->css.cgroup, > + event->cgrp->css.cgroup); > + } > +#endif > + > if (reprogram) { > ctx_sched_out(ctx, cpuctx, EVENT_TIME); > add_event_to_ctx(event, ctx); > -- > 2.8.4.31.g9ed660f >
Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
On Tue, Mar 06, 2018 at 05:36:37PM +0800, linxiu...@gmail.com wrote: > From: "leilei.lin" > > Do not install cgroup event into the CPU context and schedule it > if the cgroup is not running on this CPU OK, so far so good, this explains the bit in __perf_install_in_context(). > While there is no task of cgroup running specified CPU, current > kernel still install cgroup event into CPU context that causes > another cgroup event can't be installed into this CPU. > > This patch prevent scheduling events at __perf_install_in_context() > and installing events at list_update_cgroup_event() if cgroup isn't > running on specified CPU. This bit doesn't make sense, you don't in fact avoid anything in list_update_cgroup_event(), you do more, not less. > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 4df5b69..f3ffa70 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -933,31 +933,45 @@ list_update_cgroup_event(struct perf_event *event, > { > struct perf_cpu_context *cpuctx; > struct list_head *cpuctx_entry; > + struct perf_cgroup *cgrp; > > if (!is_cgroup_event(event)) > return; > > /* >* Because cgroup events are always per-cpu events, >* this will always be called from the right CPU. >*/ > cpuctx = __get_cpu_context(ctx); > + cgrp = perf_cgroup_from_task(current, ctx); > + > + /* > + * if only the cgroup is running on this cpu > + * and cpuctx->cgrp == NULL (otherwise it would've > + * been set with running cgroup), we put this cgroup > + * into cpu context. Or it would case mismatch in > + * following cgroup events at event_filter_match() > + */ This is utterly incomprehensible, what? > + if (add && !cpuctx->cgrp && > + cgroup_is_descendant(cgrp->css.cgroup, > + event->cgrp->css.cgroup)) { > + cpuctx->cgrp = cgrp; > + } And that's just horrible coding style. Maybe something like: if (add && cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) { if (cpuctx->cgrp) WARN_ON_ONCE(cpuctx->cgrp != cgrp); cpuctx->cgrp = cgrp; } that? But that still needs a comment to explain _why_ we do that here. Under what condition would we fail to have cpuctx->cgrp set while ctx->nr_cgroups. Your comment doesn't explain nor does your Changelog. > + > + if (add && ctx->nr_cgroups++) > + return; > + else if (!add && --ctx->nr_cgroups) > + return; > > + /* no cgroup running */ > + if (!add) > + cpuctx->cgrp = NULL; > + > + cpuctx_entry = >cgrp_cpuctx_entry; > + if (add) > list_add(cpuctx_entry, this_cpu_ptr(_cpuctx_list)); > + else > list_del(cpuctx_entry); > } > > #else /* !CONFIG_CGROUP_PERF */ > @@ -2311,6 +2325,20 @@ static int __perf_install_in_context(void *info) > raw_spin_lock(_ctx->lock); > } > > +#ifdef CONFIG_CGROUP_PERF > + if (is_cgroup_event(event)) { > + /* > + * Only care about cgroup events. > + * That bit is entirely spurious, if it right after if (is_cgroup_event()), obviously this block is only for cgroup events. > + * If only the task belongs to cgroup of this event, > + * we will continue the installment And that isn't really english. I think you meant to write something like: /* * If the current cgroup doesn't match the event's * cgroup, we should not try to schedule it. */ > + */ > + struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx); > + reprogram = cgroup_is_descendant(cgrp->css.cgroup, > + event->cgrp->css.cgroup); > + } > +#endif > + > if (reprogram) { > ctx_sched_out(ctx, cpuctx, EVENT_TIME); > add_event_to_ctx(event, ctx); > -- > 2.8.4.31.g9ed660f >
[PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
From: "leilei.lin"Do not install cgroup event into the CPU context and schedule it if the cgroup is not running on this CPU While there is no task of cgroup running specified CPU, current kernel still install cgroup event into CPU context that causes another cgroup event can't be installed into this CPU. This patch prevent scheduling events at __perf_install_in_context() and installing events at list_update_cgroup_event() if cgroup isn't running on specified CPU. Signed-off-by: leilei.lin --- v2: Set cpuctx->cgrp only if the same cgroup is running on this CPU otherwise following events couldn't be activated immediately v3: Enhance the comments and commit message v4: Adjust to config v5: Clear cpuctx->cgrp only when no cgroup event exists kernel/events/core.c | 54 +++- 1 file changed, 41 insertions(+), 13 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 4df5b69..f3ffa70 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -933,31 +933,45 @@ list_update_cgroup_event(struct perf_event *event, { struct perf_cpu_context *cpuctx; struct list_head *cpuctx_entry; + struct perf_cgroup *cgrp; if (!is_cgroup_event(event)) return; - if (add && ctx->nr_cgroups++) - return; - else if (!add && --ctx->nr_cgroups) - return; /* * Because cgroup events are always per-cpu events, * this will always be called from the right CPU. */ cpuctx = __get_cpu_context(ctx); - cpuctx_entry = >cgrp_cpuctx_entry; - /* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU .*/ - if (add) { - struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx); + cgrp = perf_cgroup_from_task(current, ctx); + + /* +* if only the cgroup is running on this cpu +* and cpuctx->cgrp == NULL (otherwise it would've +* been set with running cgroup), we put this cgroup +* into cpu context. Or it would case mismatch in +* following cgroup events at event_filter_match() +*/ + if (add && !cpuctx->cgrp && + cgroup_is_descendant(cgrp->css.cgroup, + event->cgrp->css.cgroup)) { + cpuctx->cgrp = cgrp; + } + + if (add && ctx->nr_cgroups++) + return; + else if (!add && --ctx->nr_cgroups) + return; + /* no cgroup running */ + if (!add) + cpuctx->cgrp = NULL; + + cpuctx_entry = >cgrp_cpuctx_entry; + if (add) list_add(cpuctx_entry, this_cpu_ptr(_cpuctx_list)); - if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) - cpuctx->cgrp = cgrp; - } else { + else list_del(cpuctx_entry); - cpuctx->cgrp = NULL; - } } #else /* !CONFIG_CGROUP_PERF */ @@ -2311,6 +2325,20 @@ static int __perf_install_in_context(void *info) raw_spin_lock(_ctx->lock); } +#ifdef CONFIG_CGROUP_PERF + if (is_cgroup_event(event)) { + /* +* Only care about cgroup events. +* +* If only the task belongs to cgroup of this event, +* we will continue the installment +*/ + struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx); + reprogram = cgroup_is_descendant(cgrp->css.cgroup, + event->cgrp->css.cgroup); + } +#endif + if (reprogram) { ctx_sched_out(ctx, cpuctx, EVENT_TIME); add_event_to_ctx(event, ctx); -- 2.8.4.31.g9ed660f
[PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
From: "leilei.lin" Do not install cgroup event into the CPU context and schedule it if the cgroup is not running on this CPU While there is no task of cgroup running specified CPU, current kernel still install cgroup event into CPU context that causes another cgroup event can't be installed into this CPU. This patch prevent scheduling events at __perf_install_in_context() and installing events at list_update_cgroup_event() if cgroup isn't running on specified CPU. Signed-off-by: leilei.lin --- v2: Set cpuctx->cgrp only if the same cgroup is running on this CPU otherwise following events couldn't be activated immediately v3: Enhance the comments and commit message v4: Adjust to config v5: Clear cpuctx->cgrp only when no cgroup event exists kernel/events/core.c | 54 +++- 1 file changed, 41 insertions(+), 13 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 4df5b69..f3ffa70 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -933,31 +933,45 @@ list_update_cgroup_event(struct perf_event *event, { struct perf_cpu_context *cpuctx; struct list_head *cpuctx_entry; + struct perf_cgroup *cgrp; if (!is_cgroup_event(event)) return; - if (add && ctx->nr_cgroups++) - return; - else if (!add && --ctx->nr_cgroups) - return; /* * Because cgroup events are always per-cpu events, * this will always be called from the right CPU. */ cpuctx = __get_cpu_context(ctx); - cpuctx_entry = >cgrp_cpuctx_entry; - /* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU .*/ - if (add) { - struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx); + cgrp = perf_cgroup_from_task(current, ctx); + + /* +* if only the cgroup is running on this cpu +* and cpuctx->cgrp == NULL (otherwise it would've +* been set with running cgroup), we put this cgroup +* into cpu context. Or it would case mismatch in +* following cgroup events at event_filter_match() +*/ + if (add && !cpuctx->cgrp && + cgroup_is_descendant(cgrp->css.cgroup, + event->cgrp->css.cgroup)) { + cpuctx->cgrp = cgrp; + } + + if (add && ctx->nr_cgroups++) + return; + else if (!add && --ctx->nr_cgroups) + return; + /* no cgroup running */ + if (!add) + cpuctx->cgrp = NULL; + + cpuctx_entry = >cgrp_cpuctx_entry; + if (add) list_add(cpuctx_entry, this_cpu_ptr(_cpuctx_list)); - if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) - cpuctx->cgrp = cgrp; - } else { + else list_del(cpuctx_entry); - cpuctx->cgrp = NULL; - } } #else /* !CONFIG_CGROUP_PERF */ @@ -2311,6 +2325,20 @@ static int __perf_install_in_context(void *info) raw_spin_lock(_ctx->lock); } +#ifdef CONFIG_CGROUP_PERF + if (is_cgroup_event(event)) { + /* +* Only care about cgroup events. +* +* If only the task belongs to cgroup of this event, +* we will continue the installment +*/ + struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx); + reprogram = cgroup_is_descendant(cgrp->css.cgroup, + event->cgrp->css.cgroup); + } +#endif + if (reprogram) { ctx_sched_out(ctx, cpuctx, EVENT_TIME); add_event_to_ctx(event, ctx); -- 2.8.4.31.g9ed660f