On Mon, Jul 07, 2014 at 11:04:28AM +0200, Peter Zijlstra wrote:
> On Wed, Jun 25, 2014 at 08:44:35PM +0200, Jiri Olsa wrote:
> > From: Jiri Olsa <[email protected]>
> > 
> > While iterating siblings in perf_output_read_group we could
> > race with addition and removal of sibling in perf_group_attach
> > and perf_group_detach respectively.
> 
> So why would anybody do this?

the test program from the 0/1 email hangs my server,
but there's no standard reason to do this AFAICS

> 
> > While in perf_output_read_group we are under active context,
> > so the only sibling_list modification could come via IPI in:
> >   perf_install_in_context or perf_remove_from_context
> > 
> > Disable interrupts before iterating siblings to prevent
> > this race.
> > 
> > Cc: Arnaldo Carvalho de Melo <[email protected]>
> > Cc: Corey Ashford <[email protected]>
> > Cc: Frederic Weisbecker <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Paul Mackerras <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Signed-off-by: Jiri Olsa <[email protected]>
> > ---
> >  kernel/events/core.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index a33d9a2b..66649d3 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -4509,6 +4509,7 @@ static void perf_output_read_group(struct perf_output_handle *handle,
> >  {
> >     struct perf_event *leader = event->group_leader, *sub;
> >     u64 read_format = event->attr.read_format;
> > +   unsigned long flags;
> >     u64 values[5];
> >     int n = 0;
> >  
> > @@ -4529,6 +4530,15 @@ static void perf_output_read_group(struct perf_output_handle *handle,
> >  
> >     __output_copy(handle, values, n * sizeof(u64));
> >  
> > +   /*
> > +    * We are now under active context, so the only sibling_list
> > +    * modification could come via IPI in:
> > +    *   perf_install_in_context and perf_remove_from_context
> > +    *
> > +    * Disable interrupts to prevent this race.
> > +    */
> > +   local_irq_save(flags);
> 
> I think this is too late; you want it right at the beginning, before we
> read ->nr_siblings, as that is also changed by
> add_event_to_ctx()->perf_group_attach().
> 
> That said; it would be nice not to have to poke at the interrupt flag,
> it's expensive.

right.. I'll check if we could use the rcu loop/locking here
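
To make the ordering point concrete, here is a rough userspace model, not
kernel code. fake_irq_save(), fake_irq_restore(), ipi_attach() and both
structs are made-up stand-ins for local_irq_save(), local_irq_restore(),
the IPI-driven perf_group_attach() and the perf event structures. It only
shows the interrupt flag being saved before nr_siblings is read, so the
count and the sibling walk see the same group:

```c
/* Userspace model only. Every name below is a hypothetical stand-in. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int irq_disabled;	/* models the CPU interrupt flag */

static unsigned long fake_irq_save(void)
{
	unsigned long flags = (unsigned long)irq_disabled;

	irq_disabled = 1;
	return flags;
}

static void fake_irq_restore(unsigned long flags)
{
	irq_disabled = (int)flags;
}

struct event {
	uint64_t count;
	struct event *next_sibling;
};

struct group {
	struct event leader;
	int nr_siblings;
};

/*
 * Models a sibling being attached from an IPI: it cannot run while
 * "interrupts" are off, so it reports -1 (deferred) in that case.
 */
static int ipi_attach(struct group *g, struct event *e)
{
	if (irq_disabled)
		return -1;
	e->next_sibling = g->leader.next_sibling;
	g->leader.next_sibling = e;
	g->nr_siblings++;
	return 0;
}

/*
 * Writes [nr, leader count, sibling counts...] into out and returns the
 * number of values written. The flag is saved before nr_siblings is read.
 */
static int read_group(struct group *g, uint64_t *out, int max)
{
	unsigned long flags;
	struct event *sub;
	int n = 0;

	if (max < 2)
		return 0;

	flags = fake_irq_save();
	out[n++] = 1 + (uint64_t)g->nr_siblings;
	out[n++] = g->leader.count;
	for (sub = g->leader.next_sibling; sub && n < max; sub = sub->next_sibling)
		out[n++] = sub->count;
	fake_irq_restore(flags);

	return n;
}
```

In the patch as posted the save happens only after the first
__output_copy(), so in this model's terms the nr value could already be
stale by the time the list is walked.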

> 
> So is this really a problem, or just a case of: if you do silly things,
> you get silly results?

I've got soft lockups, sometimes ended up with an unkillable perf process,
and also a few total server hangs

jirka