On Mon, Jul 07, 2025 at 10:19:52AM -0400, Joel Fernandes wrote:
> From: Joel Fernandes <joelagn...@nvidia.com>
> Subject: [PATCH] smp: Document preemption and stop_machine() mutual exclusion
>
> Recently while revising RCU's cpu online checks, there was some discussion
> around how IPIs synchronize with hotplug.
>
> Add comments explaining how preemption disable creates mutual exclusion with
> CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> cannot proceed while any CPU (including the IPI sender) has preemption
> disabled.
>
> Cc: Andrea Righi <ari...@nvidia.com>
> Cc: Paul E. McKenney <paul...@kernel.org>
> Cc: Frederic Weisbecker <frede...@kernel.org>
> Cc: r...@vger.kernel.org
> Acked-by: Paul E. McKenney <paul...@kernel.org>
> Co-developed-by: Frederic Weisbecker <frede...@kernel.org>
> Signed-off-by: Joel Fernandes <joelagn...@nvidia.com>
> ---
> I am leaving in Paul's Ack but Paul please let me know if there is a concern!
>
>  kernel/smp.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 974f3a3962e8..957959031063 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -93,6 +93,9 @@ int smpcfd_dying_cpu(unsigned int cpu)
>  	 * explicitly (without waiting for the IPIs to arrive), to
>  	 * ensure that the outgoing CPU doesn't go offline with work
>  	 * still pending.
> +	 *
> +	 * This runs with interrupts disabled inside the stopper task invoked
> +	 * by stop_machine(), ensuring CPU offlining and IPI flushing are atomic.

So below you use 'mutual exclusion', which I prefer over 'atomic' as used
here.

>  	 */
>  	__flush_smp_call_function_queue(false);
>  	irq_work_run();
> @@ -418,6 +421,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
>   */
>  static int generic_exec_single(int cpu, call_single_data_t *csd)
>  {
> +	/*
> +	 * Preemption already disabled here so stopper cannot run on this CPU,
> +	 * ensuring mutual exclusion with CPU offlining and last IPI flush.
> +	 */
>  	if (cpu == smp_processor_id()) {
>  		smp_call_func_t func = csd->func;
>  		void *info = csd->info;
> @@ -638,8 +645,10 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>  	int err;
>
>  	/*
> -	 * prevent preemption and reschedule on another processor,
> -	 * as well as CPU removal
> +	 * Prevent preemption and reschedule on another processor, as well as
> +	 * CPU removal. Also preempt_disable() prevents stopper from running on
> +	 * this CPU, thus providing atomicity between the cpu_online() check
> +	 * and IPI sending ensuring IPI is not missed by CPU going offline.

That first sentence already covers this, no? 'prevents preemption' ->
stopper task cannot run, 'CPU removal' -> no CPU_DYING (because no
stopper). Also that 'atomicity' vs 'mutual exclusion' thing.

>  	 */
>  	this_cpu = get_cpu();
>
> --
> 2.34.1
>