On Mon, Jul 07, 2025 at 08:56:04AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 07, 2025 at 09:50:50AM +0200, Peter Zijlstra wrote:
> > On Sat, Jul 05, 2025 at 01:23:27PM -0400, Joel Fernandes wrote:
> > > Recently while revising RCU's cpu online checks, there was some discussion
> > > around how IPIs synchronize with hotplug.
> > >
> > > Add comments explaining how preemption disable creates mutual exclusion with
> > > CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> > > atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> > > cannot proceed while any CPU (including the IPI sender) has preemption
> > > disabled.
> >
> > I'm very conflicted on this. While the added comments aren't wrong,
> > they're not quite accurate either. Stop_machine doesn't wait for people
> > to enable preemption as such.
> >
> > Fundamentally there seems to be a misconception around what stop machine
> > is and how it works, and I don't feel these comments make things better.
> >
> > Basically, stop-machine (and stop_one_cpu(), stop_two_cpus()) use the
> > stopper task, a task running at the ultimate priority; if it is
> > runnable, it will run.
> >
> > Stop-machine simply wakes all the stopper tasks and co-ordinates them to
> > literally stop the machine. All CPUs have the stopper task scheduled and
> > then they go sit in a spin-loop driven state machine with IRQs disabled.
> >
> > There really isn't anything magical about any of this.
>
> There is the mechanism (which you have described above), and then there
> are the use cases. Those of us maintaining a given mechanism might
> argue that a detailed description of the mechanism suffices, but that
> argument does not always win the day.
>
> I do like the description in the stop_machine() kernel-doc header:
>
>  * This can be thought of as a very heavy write lock, equivalent to
>  * grabbing every spinlock in the kernel.
>
> Though doesn't this need to upgrade "spinlock" to "raw spinlock"
> now that PREEMPT_RT is in mainline?
>
> Also, this function is more powerful than grabbing every write lock
> in the kernel because it also excludes all regions of code that have
> preemption disabled, which is one thing that CPU hotplug is relying on.
> Any objection to calling out that additional semantic?

Best to just re-formulate the entire comment, I think. State that it
provides exclusion vs all non-preemptible regions in the kernel -- at
insane cost -- and that it should be avoided whenever humanly possible :-)
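
To illustrate why that exclusion falls out of the mechanism rather than
from any explicit waiting, here is a toy userspace model. It is only a
sketch and not kernel code: every name in it (toy_cpu, toy_wake_stoppers(),
toy_schedule(), toy_machine_stopped()) is made up for illustration, and it
assumes the simplification that a stopper can be switched in only when the
CPU's current task is preemptible.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical per-CPU state for this toy model; not the kernel's layout. */
struct toy_cpu {
	int preempt_count;     /* > 0: a non-preemptible region is active */
	bool stopper_runnable; /* the stopper has been woken on this CPU */
	bool stopper_running;  /* the stopper has been switched in */
};

static struct toy_cpu cpus[NR_CPUS];

/* Wake the stopper on every CPU, as stop-machine does. */
static void toy_wake_stoppers(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		cpus[i].stopper_runnable = true;
}

/*
 * Model one scheduling opportunity on a CPU. The stopper outranks
 * everything else, but it can only be switched in once the current
 * task is preemptible.
 */
static void toy_schedule(int cpu)
{
	struct toy_cpu *c = &cpus[cpu];

	if (c->stopper_runnable && c->preempt_count == 0)
		c->stopper_running = true;
}

/* The state machine can advance only once every CPU runs its stopper. */
static bool toy_machine_stopped(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (!cpus[i].stopper_running)
			return false;
	return true;
}
```

In this model, setting cpus[2].preempt_count to 1 before waking the
stoppers keeps toy_machine_stopped() false no matter how often the other
CPUs schedule; it turns true only after CPU 2 drops its count to zero and
gets a scheduling opportunity. Nobody waits for preemption to be enabled
as such; the stopper simply cannot run until then.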