On Fri, 13 Jun 2025 22:46:12 -0400 Steven Rostedt <rost...@goodmis.org> wrote:
> +/* > + * Deferred unwinding callback for per CPU events. > + * Note, the request for the deferred unwinding may have happened > + * on a different CPU. > + */ > +static void perf_event_deferred_cpu(struct unwind_work *work, > + struct unwind_stacktrace *trace, u64 > timestamp) > +{ > + struct perf_unwind_deferred *defer = > + container_of(work, struct perf_unwind_deferred, unwind_work); > + struct perf_unwind_cpu *cpu_events, *cpu_unwind; > + struct perf_event *event; > + int cpu; > + > + guard(rcu)(); > + guard(preempt)(); > + > + cpu = smp_processor_id(); > + cpu_events = rcu_dereference(defer->cpu_events); > + cpu_unwind = &cpu_events[cpu]; > + > + WRITE_ONCE(cpu_unwind->processing, 1); > + /* > + * Make sure the above is seen before the event->unwind_deferred > + * is checked. This matches the mb() in rcuwait_rcu_wait_event() in > + * perf_remove_unwind_deferred(). > + */ > + smp_mb(); > + > + list_for_each_entry_rcu(event, &cpu_unwind->list, unwind_list) { > + /* If unwind_deferred is NULL the event is going away */ > + if (unlikely(!event->unwind_deferred)) > + continue; > + perf_event_callchain_deferred(event, trace, timestamp); > + /* Only the first CPU event gets the trace */ > + break; > + } > + Hmm, I think I need a smp_mb() here too. > + WRITE_ONCE(cpu_unwind->processing, 0); > + rcuwait_wake_up(&cpu_unwind->pending_unwind_wait); > +} The first smp_mb() is for synchronizing removing of the event from perf_remove_unwind_deferred() that has: event->unwind_deferred = NULL; /* * Make sure perf_event_deferred_cpu() is done with this event. * That function will set cpu_unwind->processing and then * call smp_mb() before iterating the list of its events. * If the event's unwind_deferred is NULL, it will be skipped. * The smp_mb() in that function matches the mb() in * rcuwait_wait_event(). */ rcuwait_wait_event(&cpu_unwind->pending_unwind_wait, !cpu_unwind->processing, TASK_UNINTERRUPTIBLE); So that the unwind_deferred setting to NULL is seen before the cpu_unwind->processing is checked. But I think, in theory, without the smp_mb() before the clearing of the cpu_unwind->procssing that it can be seen before the unwind_deferred is read. CPU 0 CPU 1 ----- ----- read event->unwind_deferred write NULL > event->unwind_deferre smp_mb() (in rcuwait) CPU writes 0 > cpu_unwind->processing (re-ordered) reads cpu_unwind->processing == 0 Starts to free event Executes perf_event_callchain_deferred() I'll add another smp_mb() to be safe in v11. -- Steve