On Fri, Jul 01, 2016 at 11:40:54AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 01, 2016 at 01:29:59AM +0200, Frederic Weisbecker wrote:
> > > +/*
> > > + * Wake up the specified CPU. If the CPU is going offline, it is the
> > > + * caller's responsibility to deal with the lost wakeup, for example,
> > > + * by hooking into the CPU_DEAD notifier like timers and hrtimers do.
> > > + */
> > >  void wake_up_nohz_cpu(int cpu)
> > >  {
> > > -	if (!wake_up_full_nohz_cpu(cpu))
> > > +	if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu))
> >
> > So at this point, as we passed CPU_DYING, I believe the CPU isn't visible
> > in the domains anymore (correct me if I'm wrong), therefore
> > get_nohz_timer_target() can't return it, unless smp_processor_id() is the
> > only alternative.
>
> Right, but the timers have been posted long before even CPU_UP_PREPARE.
> From what I can see, they are left alone until CPU_DEAD. Which means
> that if you try to mod_timer() them between CPU_DYING and CPU_DEAD,
> you can get the above splat.
>
> Or am I missing something subtle here?

Yes, that's exactly what I meant. It happens on mod_timer() calls between
CPU_DYING and CPU_DEAD. I just wanted to clarify the conditions for it to
happen: it shouldn't concern remote CPU targets, only local pinned timers.

> > Hence, that call to wake_up_nohz_cpu() can only happen to online CPUs or
> > the current one (pinned). And wake_up_idle_cpu() on the current CPU is a
> > no-op. So only wake_up_full_nohz_cpu() is concerned. Then perhaps it would
> > be better to move that cpu_online() check to wake_up_full_nohz_cpu() ?
>
> As in the patch shown below? Either way works for me.

Hmm, the patch doesn't seem to be any different from the previous one :-)

> > BTW, it seems that rcutorture stops its kthreads after CPU_DYING, is it
> > expected that it queues timers at this stage?
>
> Hmmm... From what I can see, rcutorture cleans up its priority-boost
> kthreads at CPU_DOWN_PREPARE time. The other threads are allowed to
> migrate wherever the scheduler wants, give or take the task shuffling.
> The task shuffling only excludes one CPU at a time, and I have seen
> this occur when multiple CPUs were running, e.g., 0, 2, and 3 while
> offlining 1.

But if rcutorture kthreads are cleaned up at CPU_DOWN_PREPARE, they
shouldn't be calling mod_timer() at CPU_DYING time. Or are there other
rcutorture threads?

> Besides which, doesn't the scheduler prevent anything but the idle
> thread from running after CPU_DYING time?

Indeed, migrate_tasks() is called at CPU_DYING, but pinned kthreads outside
smpboot have their own way to deal with hotplug through notifiers.

Thanks.

> 							Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 7f2cae4620c7..08502966e7df 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -590,9 +590,14 @@ static bool wake_up_full_nohz_cpu(int cpu)
>  	return false;
>  }
>
> +/*
> + * Wake up the specified CPU. If the CPU is going offline, it is the
> + * caller's responsibility to deal with the lost wakeup, for example,
> + * by hooking into the CPU_DEAD notifier like timers and hrtimers do.
> + */
>  void wake_up_nohz_cpu(int cpu)
>  {
> -	if (!wake_up_full_nohz_cpu(cpu))
> +	if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu))
>  		wake_up_idle_cpu(cpu);
>  }
>
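
To make the suggestion about moving the cpu_online() check more concrete, here is a rough userspace model of what that could look like. It is only a sketch, not a tested kernel patch: the online[] and nohz_full[] arrays and the kick counters are made-up stand-ins for kernel state, the stubs only borrow the names of the real helpers, the real function's smp_processor_id()/tick-stopped test is left out, and returning true for an offline CPU is just one possible choice (it makes the caller skip the idle wakeup as well, matching the behaviour of the patch quoted above).

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical model state, not real kernel data structures. */
static bool online[NR_CPUS]    = { true, false, true, true };
static bool nohz_full[NR_CPUS] = { false, true, true, false };
static int full_kicks[NR_CPUS];	/* counts modelled nohz_full kicks */
static int idle_kicks[NR_CPUS];	/* counts modelled idle wakeups */

/* Stubs named after the kernel helpers they stand in for. */
static bool cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return online[cpu];
}

static bool tick_nohz_full_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return nohz_full[cpu];
}

static void tick_nohz_full_kick_cpu(int cpu)
{
	full_kicks[cpu]++;
}

static void wake_up_idle_cpu(int cpu)
{
	idle_kicks[cpu]++;
}

static bool wake_up_full_nohz_cpu(int cpu)
{
	/*
	 * Assumption: an offline CPU is reported as "handled" so that the
	 * caller does not fall back to wake_up_idle_cpu() either.
	 */
	if (!cpu_online(cpu))
		return true;

	if (tick_nohz_full_cpu(cpu)) {
		tick_nohz_full_kick_cpu(cpu);
		return true;
	}

	return false;
}

/* The caller keeps its original, unconditional form. */
void wake_up_nohz_cpu(int cpu)
{
	if (!wake_up_full_nohz_cpu(cpu))
		wake_up_idle_cpu(cpu);
}
```

In this model, wake_up_nohz_cpu(1) on the offline CPU does nothing, wake_up_nohz_cpu(2) only counts a nohz_full kick, and wake_up_nohz_cpu(0) only counts an idle wakeup, so the observable behaviour is the same as with the check kept in the caller.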