On Tue, Jan 17, 2017 at 01:11:46PM +0100, Michal Hocko wrote:
> On Tue 17-01-17 04:05:13, Paul E. McKenney wrote:
> > On Tue, Jan 17, 2017 at 11:51:41AM +0100, Michal Hocko wrote:
> > > On Mon 16-01-17 16:54:03, Paul E. McKenney wrote:
> > > > On Mon, Jan 16, 2017 at 06:11:30PM +0100, Peter Zijlstra wrote:
> > > > > On Sat, Jan 14, 2017 at 01:13:12AM -0800, Paul E. McKenney wrote:
> > > > > > There is some confusion as to which of cond_resched() or
> > > > > > cond_resched_rcu_qs() should be added to long in-kernel loops.
> > > > > > This commit therefore eliminates the decision by adding RCU
> > > > > > quiescent states to cond_resched().
> > > > >
> > > > > Which would make: rcu_read_lock(); cond_resched(); rcu_read_unlock();
> > > > > invalid under preemptible RCU. Is it already?
> > > >
> > > > In theory, yes. In practice, I just tested it with preemption and
> > > > lockdep enabled, and it didn't complain. If further testing finds
> > > > complaints, we can either fix those uses (preferred) or revert
> > > > this patch.
> > > >
> > > > > > Warning: This is a prototype. For example, it does not correctly
> > > > > > handle Tasks RCU. Which is OK for the moment, given that no one
> > > > > > actually uses Tasks RCU yet.
> > > > > >
> > > > > > --- a/kernel/sched/core.c
> > > > > > +++ b/kernel/sched/core.c
> > > > > > @@ -4907,6 +4907,7 @@ int __sched _cond_resched(void)
> > > > > >  		preempt_schedule_common();
> > > > > >  		return 1;
> > > > > >  	}
> > > > > > +	rcu_all_qs();
> > > > > >  	return 0;
> > > > > >  }
> > > > >
> > > > > Still not a real fan of this, it does make cond_resched() touch a bunch
> > > > > more cachelines, also, I suppose that if we're going to do this, we
> > > > > should make __cond_resched_lock() and __cond_resched_softirq() act
> > > > > similarly.
> > > >
> > > > Michal (now CCed) argues that having to distinguish between cond_resched()
> > > > and cond_resched_rcu_qs() is overly burdensome. Michal?
> > >
> > > Yes, it is really not clear which one is meant to be in which context. I
> > > really do not see which cond_resched should be turned into
> > > cond_resched_rcu_qs.
> > >
> > > > Any thoughts on how we might remove this burden without the additional
> > > > cache misses? I will take another look as well to see what could make
> > > > it lower cost. There are probably ways... Would it make sense to
> > > > have RCU maintain a need-rcu_all_qs() flag in the same cacheline as
> > > > the __preempt_count? Perhaps throttling the writes to this flag from
> > > > the RCU grace-period kthreads to once per 100 milliseconds or so?
> > >
> > > Can the stall detector simply request rescheduling when it gets
> > > dangerously close to the timeout?
> >
> > It is quite possible that half of the stall timeout would be a better
> > choice than my 100 milliseconds, but either way, there would be need
> > for a flag or some such.
>
> E.g. set_tsk_need_resched() on the task currently running on a cpu which
> is preventing the rcu grace period for too long?
>
> That would only require change to the stall detector and the cond_resched
> could be left alone completely.

Thank you!!!

The other complication is that under CONFIG_PREEMPT=y, _cond_resched()
is an empty function. That would be one reason why use of cond_resched()
wasn't always giving RCU the quiescent states that it needs. And that is
a problem with this patch, which I therefore need to defer to 4.12.

That aside, the reason I am reluctant to use the need-resched approach
except as an emergency measure is that the way I have to set that bit
remotely involves IPIs. But don't get me wrong, it is extremely useful
as an emergency measure. I am just trying to get cond_resched() to help
on a non-emergency basis.

							Thanx, Paul
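
For readers following the thread, the two ideas discussed above (a throttled
"quiescent state wanted" flag kept next to the preempt count, plus a
need-resched request as an emergency measure at half the stall timeout) can be
modeled roughly as below. This is only a user-space sketch, not kernel code and
not the patch under discussion: struct cpu_state, gp_poke_cpu(),
sketch_cond_resched() and both constants are hypothetical names and values
chosen for illustration, and a reschedule is simply counted as a quiescent
state in this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Everything here is hypothetical: a user-space model of the idea only. */
#define QS_THROTTLE_MS   100    /* "once per 100 milliseconds or so" */
#define STALL_TIMEOUT_MS 21000  /* assumed stall timeout, illustrative value */

struct cpu_state {
	int preempt_count;          /* stands in for __preempt_count */
	bool need_qs;               /* cheap hint, kept next to preempt_count */
	bool need_resched;          /* emergency request (costly to set remotely) */
	uint64_t last_flag_ms;      /* last time the grace-period side set need_qs */
	unsigned long qs_reported;  /* quiescent states reported by this CPU */
};

/* Grace-period side: poke a CPU that has not yet reported a quiescent state. */
static void gp_poke_cpu(struct cpu_state *cpu, uint64_t now_ms,
			uint64_t gp_start_ms)
{
	assert(now_ms >= gp_start_ms);

	/* Non-emergency path: write the hint at most once per throttle period. */
	if (!cpu->need_qs && now_ms - cpu->last_flag_ms >= QS_THROTTLE_MS) {
		cpu->need_qs = true;
		cpu->last_flag_ms = now_ms;
	}

	/* Emergency path: half the stall timeout has elapsed, force a resched. */
	if (now_ms - gp_start_ms >= STALL_TIMEOUT_MS / 2)
		cpu->need_resched = true;
}

/* Loop side: returns 1 if it "rescheduled", 0 otherwise. */
static int sketch_cond_resched(struct cpu_state *cpu)
{
	if (cpu->need_resched) {
		/* Model assumption: the reschedule counts as a quiescent state. */
		cpu->need_resched = false;
		cpu->need_qs = false;
		cpu->qs_reported++;
		return 1;
	}
	if (cpu->need_qs) {
		/* Only take the more expensive reporting path when asked to. */
		cpu->need_qs = false;
		cpu->qs_reported++;
	}
	return 0;
}
```

In the common case the loop side reads only the two flags that sit beside the
preempt count, which is the point of keeping them in the same cacheline; the
more expensive work happens only after the grace-period side has asked for it.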