On Fri, 13 Jun 2025 22:46:12 -0400
Steven Rostedt <rost...@goodmis.org> wrote:

> +/*
> + * Deferred unwinding callback for per CPU events.
> + * Note, the request for the deferred unwinding may have happened
> + * on a different CPU.
> + */
> +static void perf_event_deferred_cpu(struct unwind_work *work,
> +                                 struct unwind_stacktrace *trace, u64 
> timestamp)
> +{
> +     struct perf_unwind_deferred *defer =
> +             container_of(work, struct perf_unwind_deferred, unwind_work);
> +     struct perf_unwind_cpu *cpu_events, *cpu_unwind;
> +     struct perf_event *event;
> +     int cpu;
> +
> +     guard(rcu)();
> +     guard(preempt)();
> +
> +     cpu = smp_processor_id();
> +     cpu_events = rcu_dereference(defer->cpu_events);
> +     cpu_unwind = &cpu_events[cpu];
> +
> +     WRITE_ONCE(cpu_unwind->processing, 1);
> +     /*
> +      * Make sure the above is seen before the event->unwind_deferred
> +      * is checked. This matches the mb() in rcuwait_rcu_wait_event() in
> +      * perf_remove_unwind_deferred().
> +      */
> +     smp_mb();
> +
> +     list_for_each_entry_rcu(event, &cpu_unwind->list, unwind_list) {
> +             /* If unwind_deferred is NULL the event is going away */
> +             if (unlikely(!event->unwind_deferred))
> +                     continue;
> +             perf_event_callchain_deferred(event, trace, timestamp);
> +             /* Only the first CPU event gets the trace */
> +             break;
> +     }
> +

Hmm, I think I need a smp_mb() here too.

> +     WRITE_ONCE(cpu_unwind->processing, 0);
> +     rcuwait_wake_up(&cpu_unwind->pending_unwind_wait);
> +}

The first smp_mb() is for synchronizing removing of the event from
perf_remove_unwind_deferred() that has:

        event->unwind_deferred = NULL;

        /*
         * Make sure perf_event_deferred_cpu() is done with this event.
         * That function will set cpu_unwind->processing and then
         * call smp_mb() before iterating the list of its events.
         * If the event's unwind_deferred is NULL, it will be skipped.
         * The smp_mb() in that function matches the mb() in
         * rcuwait_wait_event().
         */
        rcuwait_wait_event(&cpu_unwind->pending_unwind_wait,
                                   !cpu_unwind->processing, 
TASK_UNINTERRUPTIBLE);


So that the unwind_deferred setting to NULL is seen before the
cpu_unwind->processing is checked. But I think, in theory, without the
smp_mb() before the clearing of the cpu_unwind->procssing that it can
be seen before the unwind_deferred is read.

  CPU 0                                    CPU 1
  -----                                    -----
read event->unwind_deferred

                                        write NULL > event->unwind_deferre
                                        smp_mb() (in rcuwait)

CPU writes 0 > cpu_unwind->processing (re-ordered)

                                        reads cpu_unwind->processing == 0
                                        Starts to free event

Executes perf_event_callchain_deferred()


I'll add another smp_mb() to be safe in v11.

-- Steve





Reply via email to