Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Wed, Jun 10, 2020 at 04:21:42PM -0700, Paul E. McKenney wrote:
[...]
> > > 8. To softirq 3.  Either GP or CB kthread for the transitioning
> > >    CPU advances to next.
> > >    At this point, the no-CBs setup is fully shut down.
> > > 9. To softirq 4.  Transitioning code advances to next,
> > >    which is the first, "In softirq".
> > >    (This one -might- be unnecessary, but...)
> > >
> > > All transitions are of course with the ->nocb_lock held.
> > >
> > > When there is only one CPU during early boot near rcu_init() time,
> > > the transition from "In softirq" to "No-CB" can remain instantaneous.
> > >
> > > This has the advantage of not slowing things down just because there
> > > is an RCU callback flood in progress.  It also uses an explicit
> > > protocol that should (give or take bugs) maintain full safety both
> > > in protection of ->cblist and in dealing with RCU callback floods.
> > >
> > > Thoughts?
> >
> > Agreed.  And I really like that it details the whole process in a very
> > explicit way.
> >
> > Thanks!
>
> Glad you like it!  And of course please adjust it as needed, up to and
> including doing something completely different that works better.  ;-)

Makes sense to me too, thanks!

 - Joel
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Thu, Jun 11, 2020 at 12:12:46AM +0200, Frederic Weisbecker wrote:
> On Wed, Jun 10, 2020 at 07:02:10AM -0700, Paul E. McKenney wrote:
> > And just to argue against myself...
> >
> > Another approach is to maintain explicit multiple states for each
> > ->cblist, perhaps something like this:
> >
> > 1. In softirq.  Transition code advances to next.
> > 2. To no-CB 1.  Either GP or CB kthread for the transitioning
> >    CPU advances to next.  Note that the fact that the
> >    transition code runs on the transitioning CPU means that
> >    the RCU softirq handler doesn't need to be involved.
> > 3. To no-CB 2.  Either GP or CB kthread for the transitioning
> >    CPU advances to next.
>
> Just to clarify, if GP has set NO_CB2 in (2), we want CB to set NO_CB3
> in (3), right?  OTOH if CB has set NO_CB2 in (2), we want GP to set NO_CB3
> in (3), right?
>
> The point being to make sure that both threads acknowledge the transition?

Exactly!

> > 4. To no-CB 3.  Transitioning code advances to next.
> >    At this point, the no-CBs setup is fully functional.
>
> And softirq can stop processing callbacks from that point on.

You got it!

> > 5. No-CB.  Transitioning code advances to next.
> >    Again, the fact that the transitioning code is running
> >    on the transitioning CPU with interrupts disabled means
> >    that the RCU softirq handler need not be explicitly
> >    involved.
> > 6. To softirq 1.  The RCU softirq handler for the transitioning
> >    CPU advances to next.
> >    At this point, the RCU softirq handler is processing callbacks.
> > 7. To softirq 2.  Either GP or CB kthread for the transitioning
> >    CPU advances to next.
> >    At this point, the softirq handler is processing callbacks.
>
> SOFTIRQ2 should be part of what happens in SOFTIRQ1.  The transitioning
> CPU sets SOFTIRQ1, which is immediately visible to local softirqs,
> and wakes up CB/GP.  CB/GP sets SOFTIRQ2, CB/GP sets SOFTIRQ3 and
> we go back to transitioning code that sets IN_SOFTIRQ.
>
> Or did I miss something?

I was perhaps being overly paranoid.  You might well be right.

> > 8. To softirq 3.  Either GP or CB kthread for the transitioning
> >    CPU advances to next.
> >    At this point, the no-CBs setup is fully shut down.
> > 9. To softirq 4.  Transitioning code advances to next,
> >    which is the first, "In softirq".
> >    (This one -might- be unnecessary, but...)
> >
> > All transitions are of course with the ->nocb_lock held.
> >
> > When there is only one CPU during early boot near rcu_init() time,
> > the transition from "In softirq" to "No-CB" can remain instantaneous.
> >
> > This has the advantage of not slowing things down just because there
> > is an RCU callback flood in progress.  It also uses an explicit
> > protocol that should (give or take bugs) maintain full safety both
> > in protection of ->cblist and in dealing with RCU callback floods.
> >
> > Thoughts?
>
> Agreed.  And I really like that it details the whole process in a very
> explicit way.
>
> Thanks!

Glad you like it!  And of course please adjust it as needed, up to and
including doing something completely different that works better.  ;-)

							Thanx, Paul
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Wed, Jun 10, 2020 at 07:02:10AM -0700, Paul E. McKenney wrote:
> And just to argue against myself...
>
> Another approach is to maintain explicit multiple states for each
> ->cblist, perhaps something like this:
>
> 1. In softirq.  Transition code advances to next.
> 2. To no-CB 1.  Either GP or CB kthread for the transitioning
>    CPU advances to next.  Note that the fact that the
>    transition code runs on the transitioning CPU means that
>    the RCU softirq handler doesn't need to be involved.
> 3. To no-CB 2.  Either GP or CB kthread for the transitioning
>    CPU advances to next.

Just to clarify, if GP has set NO_CB2 in (2), we want CB to set NO_CB3
in (3), right?  OTOH if CB has set NO_CB2 in (2), we want GP to set NO_CB3
in (3), right?

The point being to make sure that both threads acknowledge the transition?

> 4. To no-CB 3.  Transitioning code advances to next.
>    At this point, the no-CBs setup is fully functional.

And softirq can stop processing callbacks from that point on.

> 5. No-CB.  Transitioning code advances to next.
>    Again, the fact that the transitioning code is running
>    on the transitioning CPU with interrupts disabled means
>    that the RCU softirq handler need not be explicitly
>    involved.
> 6. To softirq 1.  The RCU softirq handler for the transitioning
>    CPU advances to next.
>    At this point, the RCU softirq handler is processing callbacks.
> 7. To softirq 2.  Either GP or CB kthread for the transitioning
>    CPU advances to next.
>    At this point, the softirq handler is processing callbacks.

SOFTIRQ2 should be part of what happens in SOFTIRQ1.  The transitioning
CPU sets SOFTIRQ1, which is immediately visible to local softirqs,
and wakes up CB/GP.  CB/GP sets SOFTIRQ2, CB/GP sets SOFTIRQ3 and
we go back to transitioning code that sets IN_SOFTIRQ.

Or did I miss something?

> 8. To softirq 3.  Either GP or CB kthread for the transitioning
>    CPU advances to next.
>    At this point, the no-CBs setup is fully shut down.
> 9. To softirq 4.  Transitioning code advances to next,
>    which is the first, "In softirq".
>    (This one -might- be unnecessary, but...)
>
> All transitions are of course with the ->nocb_lock held.
>
> When there is only one CPU during early boot near rcu_init() time,
> the transition from "In softirq" to "No-CB" can remain instantaneous.
>
> This has the advantage of not slowing things down just because there
> is an RCU callback flood in progress.  It also uses an explicit
> protocol that should (give or take bugs) maintain full safety both
> in protection of ->cblist and in dealing with RCU callback floods.
>
> Thoughts?

Agreed.  And I really like that it details the whole process in a very
explicit way.

Thanks!
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Wed, Jun 10, 2020 at 03:12:39PM +0200, Frederic Weisbecker wrote:
> On Tue, Jun 09, 2020 at 11:02:27AM -0700, Paul E. McKenney wrote:
> > > > > And anyway we still want to unconditionally lock on many places,
> > > > > regardless of the offloaded state.  I don't know how we could have
> > > > > a magic helper do the unconditional lock on some places and the
> > > > > conditional on others.
> > > >
> > > > I was assuming (perhaps incorrectly) that an intermediate phase between
> > > > not-offloaded and offloaded would take care of all of those cases.
> > >
> > > Perhaps partly but I fear that won't be enough.
> >
> > One approach is to rely on RCU read-side critical sections surrounding
> > the lock acquisition and to stay in the intermediate phase until a grace
> > period completes, preferably call_rcu() instead of synchronize_rcu().
> >
> > This of course means refusing to do a transition if the CPU is still
> > in the intermediate state from a prior transition.
>
> That sounds good.  But using synchronize_rcu() would be far easier.  We
> need to keep the hotplug and rcu barrier locked during the transition.
>
> > > Also I've been thinking that rcu_nocb_lock() should meet any of these
> > > requirements:
> > >
> > > * hotplug is locked
> > > * rcu barrier is locked
> > > * rnp is locked
> > >
> > > Because checking the offloaded state (when nocb isn't locked yet) of
> > > an rdp without any of the above locks held is racy.  And that should
> > > be easy to check and prevent copy-pasta accidents.
> > >
> > > What do you think?
> >
> > An RCU read-side critical section might be simpler.
>
> Ok I think I can manage that.

And just to argue against myself...

Another approach is to maintain explicit multiple states for each
->cblist, perhaps something like this:

1. In softirq.  Transition code advances to next.
2. To no-CB 1.  Either GP or CB kthread for the transitioning
   CPU advances to next.  Note that the fact that the
   transition code runs on the transitioning CPU means that
   the RCU softirq handler doesn't need to be involved.
3. To no-CB 2.  Either GP or CB kthread for the transitioning
   CPU advances to next.
4. To no-CB 3.  Transitioning code advances to next.
   At this point, the no-CBs setup is fully functional.
5. No-CB.  Transitioning code advances to next.
   Again, the fact that the transitioning code is running
   on the transitioning CPU with interrupts disabled means
   that the RCU softirq handler need not be explicitly
   involved.
6. To softirq 1.  The RCU softirq handler for the transitioning
   CPU advances to next.
   At this point, the RCU softirq handler is processing callbacks.
7. To softirq 2.  Either GP or CB kthread for the transitioning
   CPU advances to next.
   At this point, the softirq handler is processing callbacks.
8. To softirq 3.  Either GP or CB kthread for the transitioning
   CPU advances to next.
   At this point, the no-CBs setup is fully shut down.
9. To softirq 4.  Transitioning code advances to next,
   which is the first, "In softirq".
   (This one -might- be unnecessary, but...)

All transitions are of course with the ->nocb_lock held.

When there is only one CPU during early boot near rcu_init() time,
the transition from "In softirq" to "No-CB" can remain instantaneous.

This has the advantage of not slowing things down just because there
is an RCU callback flood in progress.  It also uses an explicit
protocol that should (give or take bugs) maintain full safety both
in protection of ->cblist and in dealing with RCU callback floods.

Thoughts?

							Thanx, Paul
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Tue, Jun 09, 2020 at 11:02:27AM -0700, Paul E. McKenney wrote:
> > > > And anyway we still want to unconditionally lock on many places,
> > > > regardless of the offloaded state.  I don't know how we could have
> > > > a magic helper do the unconditional lock on some places and the
> > > > conditional on others.
> > >
> > > I was assuming (perhaps incorrectly) that an intermediate phase between
> > > not-offloaded and offloaded would take care of all of those cases.
> >
> > Perhaps partly but I fear that won't be enough.
>
> One approach is to rely on RCU read-side critical sections surrounding
> the lock acquisition and to stay in the intermediate phase until a grace
> period completes, preferably call_rcu() instead of synchronize_rcu().
>
> This of course means refusing to do a transition if the CPU is still
> in the intermediate state from a prior transition.

That sounds good.  But using synchronize_rcu() would be far easier.  We
need to keep the hotplug and rcu barrier locked during the transition.

> > Also I've been thinking that rcu_nocb_lock() should meet any of these
> > requirements:
> >
> > * hotplug is locked
> > * rcu barrier is locked
> > * rnp is locked
> >
> > Because checking the offloaded state (when nocb isn't locked yet) of
> > an rdp without any of the above locks held is racy.  And that should
> > be easy to check and prevent copy-pasta accidents.
> >
> > What do you think?
>
> An RCU read-side critical section might be simpler.

Ok I think I can manage that.

Thanks.
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Mon, Jun 08, 2020 at 02:57:17PM +0200, Frederic Weisbecker wrote:
> On Thu, Jun 04, 2020 at 09:36:55AM -0700, Paul E. McKenney wrote:
> > On Thu, Jun 04, 2020 at 01:41:22PM +0200, Frederic Weisbecker wrote:
> > > On Fri, May 22, 2020 at 10:57:39AM -0700, Paul E. McKenney wrote:
> > > > On Wed, May 20, 2020 at 08:29:49AM -0400, Joel Fernandes wrote:
> > > > > Reviewed-by: Joel Fernandes (Google)
> > > >
> > > > Thank you for looking this over, Joel!
> > > >
> > > > Is it feasible to make rcu_nocb_lock*() and rcu_nocb_unlock*() "do the
> > > > right thing", even when things are changing?  If it is feasible, that
> > > > would prevent any number of "interesting" copy-pasta and "just now
> > > > became common code" bugs down the road.
> > >
> > > This won't be pretty:
> > >
> > >     locked = rcu_nocb_lock();
> > >     rcu_nocb_unlock(locked);
> >
> > I was thinking in terms of a bit in the rcu_data structure into
> > which rcu_nocb_lock() and friends stored the status, and from which
> > rcu_nocb_unlock() and friends retrieved that same status.  Sort of like
> > how preemptible RCU uses the ->rcu_read_lock_nesting field in task_struct.
> >
> > As noted, this does require reworking the hotplug code to avoid the
> > current holding of two such locks concurrently, which I am happy to do
> > if that helps.
> >
> > Or am I missing a subtle (or not-so-subtle) twist here?
>
> So, during a CPU round, the nocb-gp kthread locks the corresponding rdp-nocb
> and then ignores it if it is not offloaded.  Surely there are smarter ways
> to do that, though.  Well, that's something we'll need to optimize at some
> point anyway.
>
> Also we must then make sure that the nocb timers won't ever fire while we
> switch to offloaded state or they may fiddle with internals without locking
> nocb.

Understood, the synchronization requirements during the switchover will
be tricky.  But that will be the case no matter how we set it up.  If
possible, it would be good if that trickiness could be concentrated
rather than spread far and wide.

> > > And anyway we still want to unconditionally lock on many places,
> > > regardless of the offloaded state.  I don't know how we could have
> > > a magic helper do the unconditional lock on some places and the
> > > conditional on others.
> >
> > I was assuming (perhaps incorrectly) that an intermediate phase between
> > not-offloaded and offloaded would take care of all of those cases.
>
> Perhaps partly but I fear that won't be enough.

One approach is to rely on RCU read-side critical sections surrounding
the lock acquisition and to stay in the intermediate phase until a grace
period completes, preferably call_rcu() instead of synchronize_rcu().

This of course means refusing to do a transition if the CPU is still
in the intermediate state from a prior transition.

> > > Also the point of turning the lock helpers into primitives is to make
> > > it clearer as to where we really need unconditional locking and where
> > > we allow it to be conditional.  A gift to reviewers :-)
> >
> > Unless and until someone does a copy-pasta, thus unconditionally
> > doing the wrong thing.  ;-)
>
> Yeah but the cost is more complicated code and synchronization to
> maintain the use of those all-in-one locking APIs everywhere.  And
> the current state is already not that simple :-)
>
> Perhaps we should rename rcu_nocb_lock() into rcu_nocb_lock_cond() to
> prevent accidents?

We do need something to avoid accidents.  ;-)

> Also I've been thinking that rcu_nocb_lock() should meet any of these
> requirements:
>
> * hotplug is locked
> * rcu barrier is locked
> * rnp is locked
>
> Because checking the offloaded state (when nocb isn't locked yet) of
> an rdp without any of the above locks held is racy.  And that should
> be easy to check and prevent copy-pasta accidents.
>
> What do you think?

An RCU read-side critical section might be simpler.

							Thanx, Paul
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Thu, Jun 04, 2020 at 09:36:55AM -0700, Paul E. McKenney wrote:
> On Thu, Jun 04, 2020 at 01:41:22PM +0200, Frederic Weisbecker wrote:
> > On Fri, May 22, 2020 at 10:57:39AM -0700, Paul E. McKenney wrote:
> > > On Wed, May 20, 2020 at 08:29:49AM -0400, Joel Fernandes wrote:
> > > > Reviewed-by: Joel Fernandes (Google)
> > >
> > > Thank you for looking this over, Joel!
> > >
> > > Is it feasible to make rcu_nocb_lock*() and rcu_nocb_unlock*() "do the
> > > right thing", even when things are changing?  If it is feasible, that
> > > would prevent any number of "interesting" copy-pasta and "just now became
> > > common code" bugs down the road.
> >
> > This won't be pretty:
> >
> >     locked = rcu_nocb_lock();
> >     rcu_nocb_unlock(locked);
>
> I was thinking in terms of a bit in the rcu_data structure into
> which rcu_nocb_lock() and friends stored the status, and from which
> rcu_nocb_unlock() and friends retrieved that same status.  Sort of like
> how preemptible RCU uses the ->rcu_read_lock_nesting field in task_struct.
>
> As noted, this does require reworking the hotplug code to avoid the
> current holding of two such locks concurrently, which I am happy to do
> if that helps.
>
> Or am I missing a subtle (or not-so-subtle) twist here?

So, during a CPU round, the nocb-gp kthread locks the corresponding rdp-nocb
and then ignores it if it is not offloaded.  Surely there are smarter ways
to do that, though.  Well, that's something we'll need to optimize at some
point anyway.

Also we must then make sure that the nocb timers won't ever fire while we
switch to offloaded state or they may fiddle with internals without locking
nocb.

> > And anyway we still want to unconditionally lock on many places,
> > regardless of the offloaded state.  I don't know how we could have
> > a magic helper do the unconditional lock on some places and the
> > conditional on others.
>
> I was assuming (perhaps incorrectly) that an intermediate phase between
> not-offloaded and offloaded would take care of all of those cases.

Perhaps partly but I fear that won't be enough.

> > Also the point of turning the lock helpers into primitives is to make
> > it clearer as to where we really need unconditional locking and where
> > we allow it to be conditional.  A gift to reviewers :-)
>
> Unless and until someone does a copy-pasta, thus unconditionally
> doing the wrong thing.  ;-)

Yeah but the cost is more complicated code and synchronization to
maintain the use of those all-in-one locking APIs everywhere.  And
the current state is already not that simple :-)

Perhaps we should rename rcu_nocb_lock() into rcu_nocb_lock_cond() to
prevent accidents?

Also I've been thinking that rcu_nocb_lock() should meet any of these
requirements:

* hotplug is locked
* rcu barrier is locked
* rnp is locked

Because checking the offloaded state (when nocb isn't locked yet) of
an rdp without any of the above locks held is racy.  And that should
be easy to check and prevent copy-pasta accidents.

What do you think?
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Thu, Jun 04, 2020 at 01:41:22PM +0200, Frederic Weisbecker wrote:
> On Fri, May 22, 2020 at 10:57:39AM -0700, Paul E. McKenney wrote:
> > On Wed, May 20, 2020 at 08:29:49AM -0400, Joel Fernandes wrote:
> > > Reviewed-by: Joel Fernandes (Google)
> >
> > Thank you for looking this over, Joel!
> >
> > Is it feasible to make rcu_nocb_lock*() and rcu_nocb_unlock*() "do the
> > right thing", even when things are changing?  If it is feasible, that
> > would prevent any number of "interesting" copy-pasta and "just now became
> > common code" bugs down the road.
>
> This won't be pretty:
>
>     locked = rcu_nocb_lock();
>     rcu_nocb_unlock(locked);

I was thinking in terms of a bit in the rcu_data structure into
which rcu_nocb_lock() and friends stored the status, and from which
rcu_nocb_unlock() and friends retrieved that same status.  Sort of like
how preemptible RCU uses the ->rcu_read_lock_nesting field in task_struct.

As noted, this does require reworking the hotplug code to avoid the
current holding of two such locks concurrently, which I am happy to do
if that helps.

Or am I missing a subtle (or not-so-subtle) twist here?

> And anyway we still want to unconditionally lock on many places,
> regardless of the offloaded state.  I don't know how we could have
> a magic helper do the unconditional lock on some places and the
> conditional on others.

I was assuming (perhaps incorrectly) that an intermediate phase between
not-offloaded and offloaded would take care of all of those cases.

> Also the point of turning the lock helpers into primitives is to make
> it clearer as to where we really need unconditional locking and where
> we allow it to be conditional.  A gift to reviewers :-)

Unless and until someone does a copy-pasta, thus unconditionally
doing the wrong thing.  ;-)

If we cannot avoid different spellings of ->cblist in different places,
such is life, but I do want to make sure that we have fully considered
the alternatives.

							Thanx, Paul
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Fri, May 22, 2020 at 10:57:39AM -0700, Paul E. McKenney wrote:
> On Wed, May 20, 2020 at 08:29:49AM -0400, Joel Fernandes wrote:
> > Reviewed-by: Joel Fernandes (Google)
>
> Thank you for looking this over, Joel!
>
> Is it feasible to make rcu_nocb_lock*() and rcu_nocb_unlock*() "do the
> right thing", even when things are changing?  If it is feasible, that
> would prevent any number of "interesting" copy-pasta and "just now became
> common code" bugs down the road.

This won't be pretty:

    locked = rcu_nocb_lock();
    rcu_nocb_unlock(locked);

And anyway we still want to unconditionally lock on many places,
regardless of the offloaded state.  I don't know how we could have
a magic helper do the unconditional lock on some places and the
conditional on others.

Also the point of turning the lock helpers into primitives is to make
it clearer as to where we really need unconditional locking and where
we allow it to be conditional.  A gift to reviewers :-)

Thanks.
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Tue, May 26, 2020 at 08:45:42PM -0400, Joel Fernandes wrote:
> Hi Paul,
>
> On Tue, May 26, 2020 at 6:29 PM Paul E. McKenney wrote:
> >
> > On Tue, May 26, 2020 at 05:27:56PM -0400, Joel Fernandes wrote:
> > > On Tue, May 26, 2020 at 02:09:47PM -0700, Paul E. McKenney wrote:
> > > [...]
> > > > > > > BTW, I'm really itching to give it a try to make the scheduler
> > > > > > > more deadlock resilient (that is, if the scheduler wake up path
> > > > > > > detects a deadlock, then it defers the wake up using timers, or
> > > > > > > irq_work on its own instead of passing the burden of doing so to
> > > > > > > the callers).  Thoughts?
> > > > > >
> > > > > > I have used similar approaches within RCU, but on the other hand
> > > > > > the scheduler often has tighter latency constraints than RCU does.
> > > > > > So I think that is a better question for the scheduler maintainers
> > > > > > than it is for me.  ;-)
> > > > >
> > > > > Ok, it definitely keeps coming up in my radar first with the
> > > > > rcu_read_unlock_special() stuff, and now the nocb ;-).  Perhaps it
> > > > > could also be good for a conference discussion!
> > > >
> > > > Again, please understand that RCU has way looser latency constraints
> > > > than the scheduler does.  Adding half a jiffy to wakeup latency might
> > > > not go over well, especially in the real-time application area.
> > >
> > > Yeah, agreed that the "deadlock detection" code should be pretty
> > > lightweight if/when it is written.
> >
> > In addition, to even stand a chance, you would need to use hrtimers.
> > The half-jiffy (at a minimum) delay from any other deferral mechanism
> > that I know of would be the kiss of death, especially from the viewpoint
> > of the real-time guys.
>
> Just to make sure we are talking about the same kind of overhead - the
> deferring is only needed if the rq lock is already held (detected by
> trylocking).  So there's no overhead in the common case other than the
> trylock possibly being slightly more expensive than the regular
> locking.  Also, once the scheduler defers it, it uses the same kind of
> mechanism that other deferral mechanisms use to overcome this deadlock
> (timers, irq_work etc), so the overhead then would be no different
> than what we have now - the RT users would already have the wake up
> latency in current kernels without this idea implemented.  Did I miss
> something?

Aggressive real-time applications care deeply about the uncommon case.

							Thanx, Paul

> > > > But what did the scheduler maintainers say about this idea?
> > >
> > > Last I remember when it came up during the rcu_read_unlock_special()
> > > deadlock discussions, there's no way for infra like RCU to know that
> > > it was invoked from the scheduler.
> > >
> > > The idea I am bringing up now (about the scheduler itself detecting a
> > > recursion) was never brought up (not yet) with the sched maintainers
> > > (at least not by me).
> >
> > It might be good to bounce it off of them sooner rather than later.
>
> Ok, I did that now over IRC.  Thank you!
>
> - Joel
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
Hi Paul,

On Tue, May 26, 2020 at 6:29 PM Paul E. McKenney wrote:
>
> On Tue, May 26, 2020 at 05:27:56PM -0400, Joel Fernandes wrote:
> > On Tue, May 26, 2020 at 02:09:47PM -0700, Paul E. McKenney wrote:
> > [...]
> > > > > > BTW, I'm really itching to give it a try to make the scheduler
> > > > > > more deadlock resilient (that is, if the scheduler wake up path
> > > > > > detects a deadlock, then it defers the wake up using timers, or
> > > > > > irq_work on its own instead of passing the burden of doing so to
> > > > > > the callers).  Thoughts?
> > > > >
> > > > > I have used similar approaches within RCU, but on the other hand
> > > > > the scheduler often has tighter latency constraints than RCU does.
> > > > > So I think that is a better question for the scheduler maintainers
> > > > > than it is for me.  ;-)
> > > >
> > > > Ok, it definitely keeps coming up in my radar first with the
> > > > rcu_read_unlock_special() stuff, and now the nocb ;-).  Perhaps it
> > > > could also be good for a conference discussion!
> > >
> > > Again, please understand that RCU has way looser latency constraints
> > > than the scheduler does.  Adding half a jiffy to wakeup latency might
> > > not go over well, especially in the real-time application area.
> >
> > Yeah, agreed that the "deadlock detection" code should be pretty
> > lightweight if/when it is written.
>
> In addition, to even stand a chance, you would need to use hrtimers.
> The half-jiffy (at a minimum) delay from any other deferral mechanism
> that I know of would be the kiss of death, especially from the viewpoint
> of the real-time guys.

Just to make sure we are talking about the same kind of overhead - the
deferring is only needed if the rq lock is already held (detected by
trylocking).  So there's no overhead in the common case other than the
trylock possibly being slightly more expensive than the regular
locking.  Also, once the scheduler defers it, it uses the same kind of
mechanism that other deferral mechanisms use to overcome this deadlock
(timers, irq_work etc), so the overhead then would be no different
than what we have now - the RT users would already have the wake up
latency in current kernels without this idea implemented.  Did I miss
something?

> > > But what did the scheduler maintainers say about this idea?
> >
> > Last I remember when it came up during the rcu_read_unlock_special()
> > deadlock discussions, there's no way for infra like RCU to know that
> > it was invoked from the scheduler.
> >
> > The idea I am bringing up now (about the scheduler itself detecting a
> > recursion) was never brought up (not yet) with the sched maintainers
> > (at least not by me).
>
> It might be good to bounce it off of them sooner rather than later.

Ok, I did that now over IRC.  Thank you!

- Joel
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Tue, May 26, 2020 at 05:27:56PM -0400, Joel Fernandes wrote:
> On Tue, May 26, 2020 at 02:09:47PM -0700, Paul E. McKenney wrote:
> [...]
> > > > > BTW, I'm really itching to give it a try to make the scheduler
> > > > > more deadlock resilient (that is, if the scheduler wake up path
> > > > > detects a deadlock, then it defers the wake up using timers, or
> > > > > irq_work on its own instead of passing the burden of doing so to
> > > > > the callers).  Thoughts?
> > > >
> > > > I have used similar approaches within RCU, but on the other hand
> > > > the scheduler often has tighter latency constraints than RCU does.
> > > > So I think that is a better question for the scheduler maintainers
> > > > than it is for me.  ;-)
> > >
> > > Ok, it definitely keeps coming up in my radar first with the
> > > rcu_read_unlock_special() stuff, and now the nocb ;-).  Perhaps it
> > > could also be good for a conference discussion!
> >
> > Again, please understand that RCU has way looser latency constraints
> > than the scheduler does.  Adding half a jiffy to wakeup latency might
> > not go over well, especially in the real-time application area.
>
> Yeah, agreed that the "deadlock detection" code should be pretty
> lightweight if/when it is written.

In addition, to even stand a chance, you would need to use hrtimers.
The half-jiffy (at a minimum) delay from any other deferral mechanism
that I know of would be the kiss of death, especially from the viewpoint
of the real-time guys.

> > But what did the scheduler maintainers say about this idea?
>
> Last I remember when it came up during the rcu_read_unlock_special()
> deadlock discussions, there's no way for infra like RCU to know that
> it was invoked from the scheduler.
>
> The idea I am bringing up now (about the scheduler itself detecting a
> recursion) was never brought up (not yet) with the sched maintainers
> (at least not by me).

It might be good to bounce it off of them sooner rather than later.

							Thanx, Paul
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Tue, May 26, 2020 at 02:09:47PM -0700, Paul E. McKenney wrote:
[...]
> > > > BTW, I'm really itching to give it a try to make the scheduler
> > > > more deadlock resilient (that is, if the scheduler wake up path
> > > > detects a deadlock, then it defers the wake up using timers, or
> > > > irq_work on its own instead of passing the burden of doing so to
> > > > the callers).  Thoughts?
> > >
> > > I have used similar approaches within RCU, but on the other hand the
> > > scheduler often has tighter latency constraints than RCU does.  So I
> > > think that is a better question for the scheduler maintainers than it
> > > is for me.  ;-)
> >
> > Ok, it definitely keeps coming up in my radar first with the
> > rcu_read_unlock_special() stuff, and now the nocb ;-).  Perhaps it could
> > also be good for a conference discussion!
>
> Again, please understand that RCU has way looser latency constraints
> than the scheduler does.  Adding half a jiffy to wakeup latency might
> not go over well, especially in the real-time application area.

Yeah, agreed that the "deadlock detection" code should be pretty
lightweight if/when it is written.

> But what did the scheduler maintainers say about this idea?

Last I remember when it came up during the rcu_read_unlock_special()
deadlock discussions, there's no way for infra like RCU to know that it
was invoked from the scheduler.

The idea I am bringing up now (about the scheduler itself detecting a
recursion) was never brought up (not yet) with the sched maintainers (at
least not by me).

thanks,

- Joel
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Tue, May 26, 2020 at 04:18:40PM -0400, Joel Fernandes wrote: > On Tue, May 26, 2020 at 09:29:46AM -0700, Paul E. McKenney wrote: > > On Tue, May 26, 2020 at 11:21:37AM -0400, Joel Fernandes wrote: > > > Hi Paul, > > > > > > On Fri, May 22, 2020 at 10:57:39AM -0700, Paul E. McKenney wrote: > > > > On Wed, May 20, 2020 at 08:29:49AM -0400, Joel Fernandes wrote: > > > > > On Wed, May 13, 2020 at 06:47:05PM +0200, Frederic Weisbecker wrote: > > > > > > Pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb > > > > > > timers) can unconditionally lock rdp->nocb_lock as they always > > > > > > execute > > > > > > in the context of an offloaded rdp. > > > > > > > > > > > > This also prepare for toggling CPUs to/from callback's offloaded > > > > > > mode > > > > > > where the offloaded state will possibly change when rdp->nocb_lock > > > > > > isn't taken. We'll still want the entrypoints to lock the rdp in any > > > > > > case. > > > > > > > > > > Suggested rewrite for change log: > > > > > > > > > > Make pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, > > > > > nocb > > > > > timers) unconditionally lock rdp->nocb_lock as they always execute in > > > > > the > > > > > context of an offloaded rdp. > > > > > > > > > > This prepares for future toggling of CPUs to/from callback's > > > > > offloaded mode > > > > > where the offloaded state can change when rdp->nocb_lock is not held. > > > > > We'll > > > > > still want the entrypoints to lock the rdp in any case. > > > > > > > > > > > > > > > Also, can we inline rcu_nocb_lock_irqsave() into > > > > > do_nocb_deferred_wakeup_common() since that's the only user, and then > > > > > delete > > > > > rcu_nocb_lock_irqsave() and the corresponding unlock? That would also > > > > > remove > > > > > confusion about which API to use for nocb locking (i.e. whether to > > > > > directly > > > > > acquire lock or call rcu_nocb_lock_irqsave()). 
> > > > > > > > > > Reviewed-by: Joel Fernandes (Google) > > > > > > > > Thank you for looking this over, Joel! > > > > > > > > Is it feasible to make rcu_nocb_lock*() and rcu_nocb_unlock*() "do the > > > > right thing", even when things are changing? > > > > > > One way to prevent things from changing could be to employ Steven's > > > poor-man's RCU to mediate the "changing state" itself assuming the switch > > > from the fast-path to the slow-path does not happen often. :) So whichever > > > path is changing the state needs to wait for that poor-man's GP. > > > > That should not be needed, given that acquiring ->nocb_lock on the CPU > > corresponding to that lock suffices in both cases. The trick is making > > sure that the release matches the acquire. > > Revisiting what you meant by "when things are changing", I'm assuming you > meant a CPU is dynamically switched from non-offloaded mode to > offloaded-mode. Please correct me if I'm wrong. Both. It does no good to acquire by disabling IRQs and then release by releasing a lock that you do not hold. Nor does it help to acquire the lock and then fail to release it, instead merely re-enabling IRQs. Aside from the case where two locks are held, a per-CPU variable (as in another field in the rcu_data structure) would work. And that case could probably be reworked to hold only one lock at a time. > Assuming that's true, you asked how do we "do the right thing" in the > lock/unlock APIs. I was also suggesting getting rid of them and directly > acquiring/releasing the spinlock, like Frederic does. It sounds like > that is not good enough and you want an API that can do conditional locking > (and the said complexity is hidden behind the API). Allow me to read more > code and see if I can understand that / how to do that. Indeed, the non-nohz_full code should not acquire the spinlock.
> > > Any case, are you concerned with the performance issues with the > > > unconditional locking and that's why you suggest still keeping it > > > conditional? > > > > My concerns are more about maintainability. > > Ok. > > > > Also, coming back to my point of inline the helper into the last caller - > > > do_nocb_deferred_wakeup_common(). I think since we don't need to check if > > > the > > > rdp is offloaded and do any conditional locking. The timer can be called > > > only > > > with offloaded rdp. So we can directly do the unconditional spinlock > > > instead > > > of using the rcu_nocb_lock helper there. > > > > Indeed we can. But should we? > > Yeah may be not, in the event that we could do conditional locking and > benefit. Underneath an API, as current mainline does. > > > > would prevent any number of "interesting" copy-pasta and "just now > > > > became > > > > common code" bugs down the road. And because irqs are disabled while > > > > holding the lock, it should be possible to keep state on a per-CPU > > > > basis. > > > > > > Agreed, that would be nice. Though if we could keep simple, that'd be nice > > > too. > > > >
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Tue, May 26, 2020 at 09:29:46AM -0700, Paul E. McKenney wrote: > On Tue, May 26, 2020 at 11:21:37AM -0400, Joel Fernandes wrote: > > Hi Paul, > > > > On Fri, May 22, 2020 at 10:57:39AM -0700, Paul E. McKenney wrote: > > > On Wed, May 20, 2020 at 08:29:49AM -0400, Joel Fernandes wrote: > > > > On Wed, May 13, 2020 at 06:47:05PM +0200, Frederic Weisbecker wrote: > > > > > Pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb > > > > > timers) can unconditionally lock rdp->nocb_lock as they always execute > > > > > in the context of an offloaded rdp. > > > > > > > > > > This also prepare for toggling CPUs to/from callback's offloaded mode > > > > > where the offloaded state will possibly change when rdp->nocb_lock > > > > > isn't taken. We'll still want the entrypoints to lock the rdp in any > > > > > case. > > > > > > > > Suggested rewrite for change log: > > > > > > > > Make pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb > > > > timers) unconditionally lock rdp->nocb_lock as they always execute in > > > > the > > > > context of an offloaded rdp. > > > > > > > > This prepares for future toggling of CPUs to/from callback's offloaded > > > > mode > > > > where the offloaded state can change when rdp->nocb_lock is not held. > > > > We'll > > > > still want the entrypoints to lock the rdp in any case. > > > > > > > > > > > > Also, can we inline rcu_nocb_lock_irqsave() into > > > > do_nocb_deferred_wakeup_common() since that's the only user, and then > > > > delete > > > > rcu_nocb_lock_irqsave() and the corresponding unlock? That would also > > > > remove > > > > confusion about which API to use for nocb locking (i.e. whether to > > > > directly > > > > acquire lock or call rcu_nocb_lock_irqsave()). > > > > > > > > Reviewed-by: Joel Fernandes (Google) > > > > > > Thank you for looking this over, Joel! > > > > > > Is it feasible to make rcu_nocb_lock*() and rcu_nocb_unlock*() "do the > > > right thing", even when things are changing? 
> > > > One way to prevent things from changing could be to employ Steven's > > poor-man's RCU to mediate the "changing state" itself assuming the switch > > from the fast-path to the slow-path does not happen often. :) So whichever > > path is changing the state needs to wait for that poor-man's GP. > > That should not be needed, given that acquiring ->nocb_lock on the CPU > corresponding to that lock suffices in both cases. The trick is making > sure that the release matches the acquire. Revisiting what you meant by "when things are changing", I'm assuming you meant a CPU is dynamically switched from non-offloaded mode to offloaded-mode. Please correct me if I'm wrong. Assuming that's true, you asked how do we "do the right thing" in the lock/unlock APIs. I was also suggesting getting rid of them and directly acquiring/releasing the spinlock, like Frederic does. It sounds like that is not good enough and you want an API that can do conditional locking (and the said complexity is hidden behind the API). Allow me to read more code and see if I can understand that / how to do that. > > Any case, are you concerned with the performance issues with the > > unconditional locking and that's why you suggest still keeping it > > conditional? > > My concerns are more about maintainability. Ok. > > Also, coming back to my point of inline the helper into the last caller - > > do_nocb_deferred_wakeup_common(). I think since we don't need to check if > > the > > rdp is offloaded and do any conditional locking. The timer can be called > > only > > with offloaded rdp. So we can directly do the unconditional spinlock instead > > of using the rcu_nocb_lock helper there. > > Indeed we can. But should we? Yeah may be not, in the event that we could do conditional locking and benefit. > > > would prevent any number of "interesting" copy-pasta and "just now became > > > common code" bugs down the road. 
And because irqs are disabled while > > > holding the lock, it should be possible to keep state on a per-CPU basis. > > > > Agreed, that would be nice. Though if we could keep simple, that'd be nice > > too. > > Having one set of functions/macros that are always used to protect > the ->cblist, no matter what the context, is a very attractive form of > simple. ;-) I was thinking that API is already raw_spin_lock/unlock() but I'll revisit everything. > > > The ugliest scenario is callback adoption, where there are two ->cblist > > > structures in need of being locked. In that case, changes are excluded > > > (because that is in CPU hotplug code), but is it possible to take > > > advantage of that reasonably? > > > > Could you describe this a bit more? Thanks. > > Right now, callbacks are merged directly from the outgoing CPU's ->cblist > to the surviving CPU's ->cblist. This means that both ->cblist structures > must be locked at the same time, which would require additional state. > After all, if only the one ->cblist were to be
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Tue, May 26, 2020 at 11:21:37AM -0400, Joel Fernandes wrote: > Hi Paul, > > On Fri, May 22, 2020 at 10:57:39AM -0700, Paul E. McKenney wrote: > > On Wed, May 20, 2020 at 08:29:49AM -0400, Joel Fernandes wrote: > > > On Wed, May 13, 2020 at 06:47:05PM +0200, Frederic Weisbecker wrote: > > > > Pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb > > > > timers) can unconditionally lock rdp->nocb_lock as they always execute > > > > in the context of an offloaded rdp. > > > > > > > > This also prepare for toggling CPUs to/from callback's offloaded mode > > > > where the offloaded state will possibly change when rdp->nocb_lock > > > > isn't taken. We'll still want the entrypoints to lock the rdp in any > > > > case. > > > > > > Suggested rewrite for change log: > > > > > > Make pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb > > > timers) unconditionally lock rdp->nocb_lock as they always execute in the > > > context of an offloaded rdp. > > > > > > This prepares for future toggling of CPUs to/from callback's offloaded > > > mode > > > where the offloaded state can change when rdp->nocb_lock is not held. > > > We'll > > > still want the entrypoints to lock the rdp in any case. > > > > > > > > > Also, can we inline rcu_nocb_lock_irqsave() into > > > do_nocb_deferred_wakeup_common() since that's the only user, and then > > > delete > > > rcu_nocb_lock_irqsave() and the corresponding unlock? That would also > > > remove > > > confusion about which API to use for nocb locking (i.e. whether to > > > directly > > > acquire lock or call rcu_nocb_lock_irqsave()). > > > > > > Reviewed-by: Joel Fernandes (Google) > > > > Thank you for looking this over, Joel! > > > > Is it feasible to make rcu_nocb_lock*() and rcu_nocb_unlock*() "do the > > right thing", even when things are changing? 
> > One way to prevent things from changing could be to employ Steven's > poor-man's RCU to mediate the "changing state" itself assuming the switch > from the fast-path to the slow-path does not happen often. :) So whichever > path is changing the state needs to wait for that poor-man's GP. That should not be needed, given that acquiring ->nocb_lock on the CPU corresponding to that lock suffices in both cases. The trick is making sure that the release matches the acquire. > Any case, are you concerned with the performance issues with the > unconditional locking and that's why you suggest still keeping it conditional? My concerns are more about maintainability. > Also, coming back to my point of inline the helper into the last caller - > do_nocb_deferred_wakeup_common(). I think since we don't need to check if the > rdp is offloaded and do any conditional locking. The timer can be called only > with offloaded rdp. So we can directly do the unconditional spinlock instead > of using the rcu_nocb_lock helper there. Indeed we can. But should we? > > would prevent any number of "interesting" copy-pasta and "just now became > > common code" bugs down the road. And because irqs are disabled while > > holding the lock, it should be possible to keep state on a per-CPU basis. > > Agreed, that would be nice. Though if we could keep simple, that'd be nice > too. Having one set of functions/macros that are always used to protect the ->cblist, no matter what the context, is a very attractive form of simple. ;-) > > The ugliest scenario is callback adoption, where there are two ->cblist > > structures in need of being locked. In that case, changes are excluded > > (because that is in CPU hotplug code), but is it possible to take > > advantage of that reasonably? > > Could you describe this a bit more? Thanks. Right now, callbacks are merged directly from the outgoing CPU's ->cblist to the surviving CPU's ->cblist. 
This means that both ->cblist structures must be locked at the same time, which would require additional state. After all, if only the one ->cblist were to be locked at a given time, a per-CPU variable could be used to track what method should be used to unlock the ->cblist. This could be restructured to use an intermediate private ->cblist, but care would be required with the counts, as it is a very bad idea for a large group of callbacks to become invisible. (This is not a problem for rcu_barrier(), which excludes CPU hotplug, but it could be a serious problem for callback-flood-mitigation mechanisms. Yes, they are heuristic, but...) > > Maybe these changes are the best we can do, but it would be good to > > if the same primitive locked a ->cblist regardless of context. > > Here you are comparing 2 primitives. Do you mean that just IRQs being > disabled is another primitive, and rcu_nocb_lock is another one? I am not sure what this question means, but I am advocating looking into retaining a single wrapper that decides instead of direct use of the underlying primitives. Or are you instead asking why there are two different methods of protecting the ->cblist
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
Hi Paul, On Fri, May 22, 2020 at 10:57:39AM -0700, Paul E. McKenney wrote: > On Wed, May 20, 2020 at 08:29:49AM -0400, Joel Fernandes wrote: > > On Wed, May 13, 2020 at 06:47:05PM +0200, Frederic Weisbecker wrote: > > > Pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb > > > timers) can unconditionally lock rdp->nocb_lock as they always execute > > > in the context of an offloaded rdp. > > > > > > This also prepare for toggling CPUs to/from callback's offloaded mode > > > where the offloaded state will possibly change when rdp->nocb_lock > > > isn't taken. We'll still want the entrypoints to lock the rdp in any > > > case. > > > > Suggested rewrite for change log: > > > > Make pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb > > timers) unconditionally lock rdp->nocb_lock as they always execute in the > > context of an offloaded rdp. > > > > This prepares for future toggling of CPUs to/from callback's offloaded mode > > where the offloaded state can change when rdp->nocb_lock is not held. We'll > > still want the entrypoints to lock the rdp in any case. > > > > > > Also, can we inline rcu_nocb_lock_irqsave() into > > do_nocb_deferred_wakeup_common() since that's the only user, and then delete > > rcu_nocb_lock_irqsave() and the corresponding unlock? That would also remove > > confusion about which API to use for nocb locking (i.e. whether to directly > > acquire lock or call rcu_nocb_lock_irqsave()). > > > > Reviewed-by: Joel Fernandes (Google) > > Thank you for looking this over, Joel! > > Is it feasible to make rcu_nocb_lock*() and rcu_nocb_unlock*() "do the > right thing", even when things are changing? One way to prevent things from changing could be to employ Steven's poor-man's RCU to mediate the "changing state" itself assuming the switch from the fast-path to the slow-path does not happen often. :) So whichever path is changing the state needs to wait for that poor-man's GP. 
Any case, are you concerned with the performance issues with the unconditional locking and that's why you suggest still keeping it conditional? Also, coming back to my point of inline the helper into the last caller - do_nocb_deferred_wakeup_common(). I think since we don't need to check if the rdp is offloaded and do any conditional locking. The timer can be called only with offloaded rdp. So we can directly do the unconditional spinlock instead of using the rcu_nocb_lock helper there. > would prevent any number of "interesting" copy-pasta and "just now became > common code" bugs down the road. And because irqs are disabled while > holding the lock, it should be possible to keep state on a per-CPU basis. Agreed, that would be nice. Though if we could keep simple, that'd be nice too. > The ugliest scenario is callback adoption, where there are two ->cblist > structures in need of being locked. In that case, changes are excluded > (because that is in CPU hotplug code), but is it possible to take > advantage of that reasonably? Could you describe this a bit more? Thanks. > Maybe these changes are the best we can do, but it would be good to > if the same primitive locked a ->cblist regardless of context. Here you are comparing 2 primitives. Do you mean that just IRQs being disabled is another primitive, and rcu_nocb_lock is another one? BTW, I'm really itching to give it a try to make the scheduler more deadlock resilient (that is, if the scheduler wake up path detects a deadlock, then it defers the wake up using timers, or irq_work on its own instead of passing the burden of doing so to the callers). Thoughts? thanks, - Joel > Can that be made to work reasonably? > > Thanx, Paul > > > thanks, > > > > - Joel > > > > > > > > > > Signed-off-by: Frederic Weisbecker > > > Cc: Paul E. 
McKenney > > > Cc: Josh Triplett > > > Cc: Steven Rostedt > > > Cc: Mathieu Desnoyers > > > Cc: Lai Jiangshan > > > Cc: Joel Fernandes > > > --- > > > kernel/rcu/tree_plugin.h | 14 +++--- > > > 1 file changed, 7 insertions(+), 7 deletions(-) > > > > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > > > index 097635c41135..523570469864 100644 > > > --- a/kernel/rcu/tree_plugin.h > > > +++ b/kernel/rcu/tree_plugin.h > > > @@ -1909,7 +1909,7 @@ static void do_nocb_bypass_wakeup_timer(struct > > > timer_list *t) > > > struct rcu_data *rdp = from_timer(rdp, t, nocb_bypass_timer); > > > > > > trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Timer")); > > > - rcu_nocb_lock_irqsave(rdp, flags); > > > + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); > > > smp_mb__after_spinlock(); /* Timer expire before wakeup. */ > > > __call_rcu_nocb_wake(rdp, true, flags); > > > } > > > @@ -1942,7 +1942,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp) > > >*/ > > > for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_cb_rdp) { > > > trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check")); >
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Wed, May 20, 2020 at 08:29:49AM -0400, Joel Fernandes wrote: > On Wed, May 13, 2020 at 06:47:05PM +0200, Frederic Weisbecker wrote: > > Pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb > > timers) can unconditionally lock rdp->nocb_lock as they always execute > > in the context of an offloaded rdp. > > > > This also prepare for toggling CPUs to/from callback's offloaded mode > > where the offloaded state will possibly change when rdp->nocb_lock > > isn't taken. We'll still want the entrypoints to lock the rdp in any > > case. > > Suggested rewrite for change log: > > Make pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb > timers) unconditionally lock rdp->nocb_lock as they always execute in the > context of an offloaded rdp. > > This prepares for future toggling of CPUs to/from callback's offloaded mode > where the offloaded state can change when rdp->nocb_lock is not held. We'll > still want the entrypoints to lock the rdp in any case. > > > Also, can we inline rcu_nocb_lock_irqsave() into > do_nocb_deferred_wakeup_common() since that's the only user, and then delete > rcu_nocb_lock_irqsave() and the corresponding unlock? That would also remove > confusion about which API to use for nocb locking (i.e. whether to directly > acquire lock or call rcu_nocb_lock_irqsave()). > > Reviewed-by: Joel Fernandes (Google) Thank you for looking this over, Joel! Is it feasible to make rcu_nocb_lock*() and rcu_nocb_unlock*() "do the right thing", even when things are changing? If it is feasible, that would prevent any number of "interesting" copy-pasta and "just now became common code" bugs down the road. And because irqs are disabled while holding the lock, it should be possible to keep state on a per-CPU basis. The ugliest scenario is callback adoption, where there are two ->cblist structures in need of being locked. 
In that case, changes are excluded (because that is in CPU hotplug code), but is it possible to take advantage of that reasonably? Maybe these changes are the best we can do, but it would be good if the same primitive locked a ->cblist regardless of context. Can that be made to work reasonably? Thanx, Paul > thanks, > > - Joel > > > > > > Signed-off-by: Frederic Weisbecker > > Cc: Paul E. McKenney > > Cc: Josh Triplett > > Cc: Steven Rostedt > > Cc: Mathieu Desnoyers > > Cc: Lai Jiangshan > > Cc: Joel Fernandes > > --- > > kernel/rcu/tree_plugin.h | 14 +++--- > > 1 file changed, 7 insertions(+), 7 deletions(-) > > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > > index 097635c41135..523570469864 100644 > > --- a/kernel/rcu/tree_plugin.h > > +++ b/kernel/rcu/tree_plugin.h > > @@ -1909,7 +1909,7 @@ static void do_nocb_bypass_wakeup_timer(struct > > timer_list *t) > > struct rcu_data *rdp = from_timer(rdp, t, nocb_bypass_timer); > > > > trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Timer")); > > - rcu_nocb_lock_irqsave(rdp, flags); > > + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); > > smp_mb__after_spinlock(); /* Timer expire before wakeup.
*/ > > __call_rcu_nocb_wake(rdp, true, flags); > > } > > @@ -1942,7 +1942,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp) > > */ > > for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_cb_rdp) { > > trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check")); > > - rcu_nocb_lock_irqsave(rdp, flags); > > + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); > > bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass); > > if (bypass_ncbs && > > (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) || > > @@ -1951,7 +1951,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp) > > (void)rcu_nocb_try_flush_bypass(rdp, j); > > bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass); > > } else if (!bypass_ncbs && rcu_segcblist_empty(&rdp->cblist)) { > > - rcu_nocb_unlock_irqrestore(rdp, flags); > > + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); > > continue; /* No callbacks here, try next. */ > > } > > if (bypass_ncbs) { > > @@ -1996,7 +1996,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp) > > } else { > > needwake = false; > > } > > - rcu_nocb_unlock_irqrestore(rdp, flags); > > + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); > > if (needwake) { > > swake_up_one(&rdp->nocb_cb_wq); > > gotcbs = true; > > @@ -2084,7 +2084,7 @@ static void nocb_cb_wait(struct rcu_data *rdp) > > rcu_do_batch(rdp); > > local_bh_enable(); > > lockdep_assert_irqs_enabled(); > > - rcu_nocb_lock_irqsave(rdp, flags); > > + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); > > if (rcu_segcblist_nextgp(&rdp->cblist,
Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code entrypoints
On Wed, May 13, 2020 at 06:47:05PM +0200, Frederic Weisbecker wrote: > Pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb > timers) can unconditionally lock rdp->nocb_lock as they always execute > in the context of an offloaded rdp. > > This also prepare for toggling CPUs to/from callback's offloaded mode > where the offloaded state will possibly change when rdp->nocb_lock > isn't taken. We'll still want the entrypoints to lock the rdp in any > case. Suggested rewrite for change log: Make pure NOCB code entrypoints (nocb_cb kthread, nocb_gp kthread, nocb timers) unconditionally lock rdp->nocb_lock as they always execute in the context of an offloaded rdp. This prepares for future toggling of CPUs to/from callback's offloaded mode where the offloaded state can change when rdp->nocb_lock is not held. We'll still want the entrypoints to lock the rdp in any case. Also, can we inline rcu_nocb_lock_irqsave() into do_nocb_deferred_wakeup_common() since that's the only user, and then delete rcu_nocb_lock_irqsave() and the corresponding unlock? That would also remove confusion about which API to use for nocb locking (i.e. whether to directly acquire lock or call rcu_nocb_lock_irqsave()). Reviewed-by: Joel Fernandes (Google) thanks, - Joel > > Signed-off-by: Frederic Weisbecker > Cc: Paul E. 
McKenney > Cc: Josh Triplett > Cc: Steven Rostedt > Cc: Mathieu Desnoyers > Cc: Lai Jiangshan > Cc: Joel Fernandes > --- > kernel/rcu/tree_plugin.h | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > index 097635c41135..523570469864 100644 > --- a/kernel/rcu/tree_plugin.h > +++ b/kernel/rcu/tree_plugin.h > @@ -1909,7 +1909,7 @@ static void do_nocb_bypass_wakeup_timer(struct > timer_list *t) > struct rcu_data *rdp = from_timer(rdp, t, nocb_bypass_timer); > > trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Timer")); > - rcu_nocb_lock_irqsave(rdp, flags); > + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); > smp_mb__after_spinlock(); /* Timer expire before wakeup. */ > __call_rcu_nocb_wake(rdp, true, flags); > } > @@ -1942,7 +1942,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp) >*/ > for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_cb_rdp) { > trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check")); > - rcu_nocb_lock_irqsave(rdp, flags); > + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); > bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass); > if (bypass_ncbs && > (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) || > @@ -1951,7 +1951,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp) > (void)rcu_nocb_try_flush_bypass(rdp, j); > bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass); > } else if (!bypass_ncbs && rcu_segcblist_empty(&rdp->cblist)) { > - rcu_nocb_unlock_irqrestore(rdp, flags); > + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); > continue; /* No callbacks here, try next.
*/ > } > if (bypass_ncbs) { > @@ -1996,7 +1996,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp) > } else { > needwake = false; > } > - rcu_nocb_unlock_irqrestore(rdp, flags); > + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); > if (needwake) { > swake_up_one(&rdp->nocb_cb_wq); > gotcbs = true; > @@ -2084,7 +2084,7 @@ static void nocb_cb_wait(struct rcu_data *rdp) > rcu_do_batch(rdp); > local_bh_enable(); > lockdep_assert_irqs_enabled(); > - rcu_nocb_lock_irqsave(rdp, flags); > + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); > if (rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) && > rcu_seq_done(&rnp->gp_seq, cur_gp_seq) && > raw_spin_trylock_rcu_node(rnp)) { /* irqs already disabled. */ > @@ -2092,7 +2092,7 @@ static void nocb_cb_wait(struct rcu_data *rdp) > raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */ > } > if (rcu_segcblist_ready_cbs(&rdp->cblist)) { > - rcu_nocb_unlock_irqrestore(rdp, flags); > + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); > if (needwake_gp) > rcu_gp_kthread_wake(); > return; > @@ -2100,7 +2100,7 @@ static void nocb_cb_wait(struct rcu_data *rdp) > > trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("CBSleep")); > WRITE_ONCE(rdp->nocb_cb_sleep, true); > - rcu_nocb_unlock_irqrestore(rdp, flags); > + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); > if (needwake_gp) > rcu_gp_kthread_wake(); > swait_event_interruptible_exclusive(rdp->nocb_cb_wq, > -- > 2.25.0 >