On Mon, May 01, 2017 at 02:44:02PM -0400, Tejun Heo wrote:
> Hello, Paul.
>
> On Mon, May 01, 2017 at 11:38:07AM -0700, Paul E. McKenney wrote:
> > On Mon, May 01, 2017 at 09:57:47AM -0700, Paul E. McKenney wrote:
> > > Hello!
> > >
> > > I am hitting this WARN_ON_ONCE() in process_one_work() and am wondering
> > > what I did wrong to make this happen:
> >
> > Oh, wait... Rescuer, it says. Might this be due to the fact that RCU's
> > expedited grace periods block within a workqueue handler? Might this
> > in turn run the system out of workqueue kthreads? If this is the likely
> > cause, my approach would be to rework the expected-grace-period workqueue
> > handler to return when waiting for the grace period to complete, and to
> > replace the current wakeup with a schedule_work() or something similar.
>
> That should be completely fine. It could just be that the rescuer
> path has a bug around CPU hotplug handling. Can you please confirm
> either way on the cpuset usage?

I have no explicit cpuset usage or affinity of the workqueue handlers
themselves. However, this is thus far only happening in
CONFIG_NO_HZ_FULL=y runs, in this case, with the kernel boot parameter
nohz_full=2-9 out of 16 CPUs. IIRC, this sets up a "housekeeping"
cpuset that pushes normal tasks away from the nohz_full CPUs.

I do build with CONFIG_HOTPLUG_CPU=y, and the test does a lot of
hotplugging. Also, other kthreads (but again, not the workqueue
handlers) do a lot of explicit CPU-affinity manipulation.

							Thanx, Paul