Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-09-06 Thread Paul E. McKenney
On Tue, Aug 22, 2017 at 08:26:37AM -0700, Paul E. McKenney wrote:
> On Tue, Aug 22, 2017 at 02:21:32PM +0530, Abdul Haleem wrote:
> > On Tue, 2017-08-22 at 08:49 +0100, Jonathan Cameron wrote:

[ . . . ]

> > No more RCU stalls on PowerPC, system is clean when idle or with some
> > test runs.
> > 
> > Thank you all for your time and efforts in fixing this.
> > 
> > Reported-and-Tested-by: Abdul Haleem 
> 
> I am still seeing failures, but then again I am running rcutorture with
> lots of CPU hotplug activity.  So I am probably seeing some other bug,
> though it still looks a lot like a lost timer.

So one problem appears to be a timing-related deadlock between RCU and
timers.  The way that this can happen is that the outgoing CPU goes
offline (as in cpuhp_report_idle_dead() invoked from do_idle()) with
one of RCU's grace-period kthread's timers queued.  Now, if someone
waits for a grace period, either directly or indirectly, in a way that
blocks the hotplug notifiers, execution will never reach timers_dead_cpu(),
which means that the grace-period kthread will never wake, which will
mean that the grace period will never complete.  Classic deadlock.
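
For intuition, here is a self-contained userspace model of that wait
cycle: pthreads stand in for the kthreads, a condition variable stands
in for the grace-period kthread's stranded timer, and a timed wait
plays the role of a stall detector plus a forced wakeup.  This is
purely illustrative; none of these names correspond to kernel code.

/* Build with: cc -o gp-deadlock gp-deadlock.c -lpthread */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int timers_migrated;	/* timers_dead_cpu() having run */
static int gp_completed;	/* the grace period having ended */

/* Grace-period "kthread": its wakeup timer is stranded on the outgoing
 * CPU, so it cannot make progress until the timers are migrated. */
static void *gp_kthread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	while (!timers_migrated)	/* the "timer" never fires... */
		pthread_cond_wait(&cond, &lock);
	gp_completed = 1;
	pthread_cond_broadcast(&cond);
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Hotplug path: waits for a grace period *before* the point where it
 * would migrate the dead CPU's timers, closing the cycle. */
static void *hotplug_notifier(void *arg)
{
	struct timespec deadline;

	(void)arg;
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += 2;	/* report the stall rather than hang */

	pthread_mutex_lock(&lock);
	while (!gp_completed) {
		if (pthread_cond_timedwait(&cond, &lock, &deadline)) {
			puts("stall: grace period never completes, so "
			     "timers_dead_cpu() is never reached");
			/* Break the cycle the way the workaround described
			 * below does: force the wakeup from outside. */
			timers_migrated = 1;
			pthread_cond_broadcast(&cond);
			break;
		}
	}
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t gp, hp;

	pthread_create(&gp, NULL, gp_kthread, NULL);
	pthread_create(&hp, NULL, hotplug_notifier, NULL);
	pthread_join(hp, NULL);
	pthread_join(gp, NULL);	/* completes once the forced wakeup fires */
	return 0;
}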

I currently have an extremely ugly workaround for this deadlock, which
is to periodically and (usually) redundantly wake up all the RCU
grace-period kthreads from the scheduling-interrupt handler.  This is
of course completely inappropriate for mainline, but it does reliably
prevent the "kthread starved for %ld jiffies!" type of RCU CPU stall
warning that I would otherwise see.

To mainline this, one approach would be to make the timer code switch
to add_timer_on(), queueing onto a surviving CPU, once the offlining
process starts.  Alternatively, I suppose that RCU could do the
redundant-wakeup kludge, but with checks to prevent it from happening
unless (1) there is a CPU in the process of going offline, (2) there
is an RCU grace period in progress, and (3) the RCU grace-period
kthread has been blocked for (say) three times longer than it should
have been.
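
As a rough standalone sketch of that second option, the guard might
look like the following; every name and the wait-budget bookkeeping
here are invented for illustration and correspond to no existing RCU
code.

#include <stdbool.h>
#include <stdio.h>

/* Dummy state standing in for RCU/hotplug bookkeeping: */
static bool cpu_going_offline = true;		/* condition (1) */
static bool gp_in_progress = true;		/* condition (2) */
static unsigned long jiffies_now = 5000;
static unsigned long gp_kthread_last_ran = 1000;
static unsigned long gp_wait_budget = 1000;	/* expected wait, jiffies */

static void wake_gp_kthread(void)
{
	puts("issuing redundant grace-period kthread wakeup");
}

/* Would be invoked periodically, e.g. from the scheduling-clock
 * interrupt, and would normally do nothing at all: */
static void maybe_kick_gp_kthread(void)
{
	if (!cpu_going_offline || !gp_in_progress)
		return;
	/* condition (3): blocked ~3x longer than it should be */
	if (jiffies_now - gp_kthread_last_ran > 3 * gp_wait_budget)
		wake_gp_kthread();
}

int main(void)
{
	maybe_kick_gp_kthread();
	return 0;
}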

Unfortunately, this is not sufficient to make rcutorture run reliably,
though it does help, which is of course to say that it makes debugging
slower.  ;-)

What happens now is that random rcutorture kthreads will hang waiting
for timeouts to complete.  This confused me for a while because I
expected that the timeouts would be delayed during offline processing,
but that my crude deadlock-resolution approach would eventually get
things going.  My current suspicion is that the problem is due to a
potential delay between the time an outgoing CPU hits
cpuhp_report_idle_dead() and the time its timers get migrated by
timers_dead_cpu().  This means that the CPU adopting the timers might
be a few ticks ahead of where the outgoing CPU last processed timers.
My current guess is that any timers queued in the intervening indexes
are going to wait one good long time.  And I don't see any code in
timers_dead_cpu() that would account for this possibility, though I of
course cannot claim to fully understand this code.

Is this plausible, or am I confused?  (Either way, -something- besides
just me is rather thoroughly confused!)

If this is plausible, my guess is that timers_dead_cpu() needs to check
for mismatched indexes (in timer->flags?) and force any intervening
timers to expire if so.
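
To see how intervening indexes could strand a timer, consider a toy
single-level wheel; the real wheel is multi-level with coarser
granularity per level, so actual delays can be far longer than one
rotation.  This is a deliberately simplified model, not kernel code.

#include <stdbool.h>
#include <stdio.h>

#define WHEEL_SIZE 8	/* real wheels are multi-level and much larger */

struct toy_base {
	unsigned long clk;		/* next tick this base will scan */
	bool pending[WHEEL_SIZE];	/* one toy timer slot per bucket */
};

static void toy_enqueue(struct toy_base *b, unsigned long expires)
{
	b->pending[expires % WHEEL_SIZE] = true;
}

/* Advance one tick, firing whatever sits in the current bucket. */
static void toy_tick(struct toy_base *b)
{
	unsigned long idx = b->clk % WHEEL_SIZE;

	if (b->pending[idx]) {
		printf("bucket %lu fires at clk=%lu\n", idx, b->clk);
		b->pending[idx] = false;
	}
	b->clk++;
}

int main(void)
{
	struct toy_base old = { .clk = 100 };	/* outgoing CPU's base */
	struct toy_base new = { .clk = 103 };	/* adopter, 3 ticks ahead */
	unsigned long i;

	toy_enqueue(&old, 101);	/* due one tick after old.clk */

	/* Naive migration: splice the buckets across verbatim.  The
	 * timer sits in bucket 101 % 8 == 5, which new.clk has already
	 * scanned past, so it cannot fire until the index wraps. */
	for (i = 0; i < WHEEL_SIZE; i++)
		new.pending[i] = old.pending[i];

	for (i = 0; i < WHEEL_SIZE; i++)
		toy_tick(&new);	/* fires at clk=109, eight ticks late */
	return 0;
}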

Thoughts?

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-22 Thread Paul E. McKenney
On Tue, Aug 22, 2017 at 02:21:32PM +0530, Abdul Haleem wrote:
> On Tue, 2017-08-22 at 08:49 +0100, Jonathan Cameron wrote:
> > On Mon, 21 Aug 2017 13:55:04 -0700
> > David Miller  wrote:
> > 
> > > From: Nicholas Piggin 
> > > Date: Tue, 22 Aug 2017 00:19:28 +1000
> > > 
> > > > Thanks, here's an updated version with a couple more bugs fixed. If
> > > > you could try testing, that would be much appreciated.
> > > 
> > > I'm not getting RCU stalls on sparc64 any longer with this patch.
> > > 
> > > I'm really happy you guys were able to figure out what was going
> > > wrong. :-)
> > > 
> > > Feel free to add my Tested-by:
> > > 
> > 
> > Likewise - 16 hours of clean run with the latest
> > 
> > Tested-by: Jonathan Cameron 
> > 
> > Thanks for all the hard work everyone put into this one, great to
> > cross it off the list!
> > 
> > Jonathan
> > 
> 
> No more RCU stalls on PowerPC, system is clean when idle or with some
> test runs.
> 
> Thank you all for your time and efforts in fixing this.
> 
> Reported-and-Tested-by: Abdul Haleem 

I am still seeing failures, but then again I am running rcutorture with
lots of CPU hotplug activity.  So I am probably seeing some other bug,
though it still looks a lot like a lost timer.

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-22 Thread Abdul Haleem
On Tue, 2017-08-22 at 08:49 +0100, Jonathan Cameron wrote:
> On Mon, 21 Aug 2017 13:55:04 -0700
> David Miller  wrote:
> 
> > From: Nicholas Piggin 
> > Date: Tue, 22 Aug 2017 00:19:28 +1000
> > 
> > > Thanks, here's an updated version with a couple more bugs fixed. If
> > > you could try testing, that would be much appreciated.
> > 
> > I'm not getting RCU stalls on sparc64 any longer with this patch.
> > 
> > I'm really happy you guys were able to figure out what was going
> > wrong. :-)
> > 
> > Feel free to add my Tested-by:
> > 
> 
> Likewise - 16 hours of clean run with the latest
> 
> Tested-by: Jonathan Cameron 
> 
> Thanks for all the hard work everyone put into this one, great to
> cross it off the list!
> 
> Jonathan
> 

No more RCU stalls on PowerPC, system is clean when idle or with some
test runs.

Thank you all for your time and efforts in fixing this.

Reported-and-Tested-by: Abdul Haleem 

-- 
Regards,

Abdul Haleem
IBM Linux Technology Centre





Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-22 Thread Jonathan Cameron
On Mon, 21 Aug 2017 13:55:04 -0700
David Miller  wrote:

> From: Nicholas Piggin 
> Date: Tue, 22 Aug 2017 00:19:28 +1000
> 
> > Thanks, here's an updated version with a couple more bugs fixed. If
> > you could try testing, that would be much appreciated.
> 
> I'm not getting RCU stalls on sparc64 any longer with this patch.
> 
> I'm really happy you guys were able to figure out what was going
> wrong. :-)
> 
> Feel free to add my Tested-by:
> 

Likewise - 16 hours of clean run with the latest

Tested-by: Jonathan Cameron 

Thanks for all the hard work everyone put into this one, great to
cross it off the list!

Jonathan


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-21 Thread Paul E. McKenney
On Mon, Aug 21, 2017 at 10:52:58AM +1000, Nicholas Piggin wrote:
> On Sun, 20 Aug 2017 14:14:29 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:
> > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:  
> > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > Nicholas Piggin  wrote:
> > > >   
> > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > "Paul E. McKenney"  wrote:  
> > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > > 
> > > > > > Thomas, John, am I misinterpreting the timer trace event messages?  
> > > > > >   
> > > > > 
> > > > > So I did some digging, and what you find is that rcu_sched seems
> > > > > to do a simple schedule_timeout(1) and just goes out to lunch for
> > > > > many seconds. The process_timeout timer never fires (when it
> > > > > finally does wake after one of these events, it usually removes
> > > > > the timer with del_timer_sync).
> > > > > 
> > > > > So this patch seems to fix it. Testing, comments welcome.
> > > > 
> > > > Okay this had a problem of trying to forward the timer from a timer
> > > > callback function.
> > > > 
> > > > This was my other approach which also fixes the RCU warnings, but it's
> > > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > > only when needed, it probably saves a load).  
> > > 
> > > Giving this one a whirl!  
> > 
> > No joy here, but then again there are other reasons to believe that I
> > am seeing a different bug than Dave and Jonathan are.
> > 
> > OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
> > is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
> > that is not statistically different from what I see without either
> > patch, and I still see the "rcu_sched kthread starved" messages.
> > 
> > For whatever it is worth, by the way, I also see this: "hrtimer:
> > interrupt took 5712368 ns".  Hmmm...  I am also seeing that without
> > any of your patches.  Might be hypervisor preemption, I guess.
> 
> Okay it makes the warnings go away for me, but I'm just booting then
> leaving the system idle. You're doing some CPU hotplug activity?

Yes, along with rcutorture, so a very different workload.

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-21 Thread David Miller
From: Nicholas Piggin 
Date: Tue, 22 Aug 2017 00:19:28 +1000

> Thanks, here's an updated version with a couple more bugs fixed. If
> you could try testing, that would be much appreciated.

I'm not getting RCU stalls on sparc64 any longer with this patch.

I'm really happy you guys were able to figure out what was going
wrong. :-)

Feel free to add my Tested-by:



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-21 Thread Jonathan Cameron
On Tue, 22 Aug 2017 00:19:28 +1000
Nicholas Piggin  wrote:

> On Mon, 21 Aug 2017 11:18:33 +0100
> Jonathan Cameron  wrote:
> 
> > On Mon, 21 Aug 2017 16:06:05 +1000
> > Nicholas Piggin  wrote:
> >   
> > > On Mon, 21 Aug 2017 10:52:58 +1000
> > > Nicholas Piggin  wrote:
> > > 
> > > > On Sun, 20 Aug 2017 14:14:29 -0700
> > > > "Paul E. McKenney"  wrote:
> > > >   
> > > > > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote: 
> > > > >
> > > > > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:
> > > > > >   
> > > > > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > > > > Nicholas Piggin  wrote:
> > > > > > >   
> > > > > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > > > > "Paul E. McKenney"  wrote:  
> > > > > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney 
> > > > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > > Thomas, John, am I misinterpreting the timer trace event 
> > > > > > > > > messages?
> > > > > > > > 
> > > > > > > > So I did some digging, and what you find is that rcu_sched
> > > > > > > > seems to do a simple schedule_timeout(1) and just goes out
> > > > > > > > to lunch for many seconds. The process_timeout timer never
> > > > > > > > fires (when it finally does wake after one of these events,
> > > > > > > > it usually removes the timer with del_timer_sync).
> > > > > > > > 
> > > > > > > > So this patch seems to fix it. Testing, comments welcome.
> > > > > > > >
> > > > > > > 
> > > > > > > Okay this had a problem of trying to forward the timer from
> > > > > > > a timer callback function.
> > > > > > > 
> > > > > > > This was my other approach which also fixes the RCU warnings,
> > > > > > > but it's a little more complex. I reworked it a bit so the
> > > > > > > mod_timer fast path hopefully doesn't have much more overhead
> > > > > > > (actually by reading jiffies only when needed, it probably
> > > > > > > saves a load).
> > > > > > 
> > > > > > Giving this one a whirl!  
> > > > > 
> > > > > No joy here, but then again there are other reasons to believe that I
> > > > > am seeing a different bug than Dave and Jonathan are.
> > > > > 
> > > > > OK, not -entirely- without joy -- 10 of 14 runs were error-free,
> > > > > which is a good improvement over 0 of 84 for your earlier patch.
> > > > > ;-)  But that is not statistically different from what I see
> > > > > without either patch, and I still see the "rcu_sched kthread
> > > > > starved" messages.
> > > > > 
> > > > > For whatever it is worth, by the way, I also see this: "hrtimer:
> > > > > interrupt took 5712368 ns".  Hmmm...  I am also seeing that
> > > > > without any of your patches.  Might be hypervisor preemption, I
> > > > > guess.
> > > > 
> > > > Okay it makes the warnings go away for me, but I'm just booting then
> > > > leaving the system idle. You're doing some CPU hotplug activity?  
> > > 
> > > Okay found a bug in the patch (it was not forwarding properly before
> > > adding the first timer after an idle) and a few other concerns.
> > > 
> > > There's still a problem of a timer function doing a mod_timer() from
> > > within expire_timers. It can't forward the base, which might currently
> > > be quite a way behind. I *think* after we close these gaps and get
> > > timely wakeups for timers on there, it should not get too far behind
> > > for standard timers.
> > > 
> > > Deferrable is a different story. Firstly it has no idle tracking so we
> > > never forward it. Even if we wanted to, we can't do it reliably because
> > > it could contain timers way behind the base. They are "deferrable", so
> > > you get what you pay for, but this still means there's a window where
> > > you can add a deferrable timer and get a far later expiry than you
> > > asked for despite the CPU never going idle after you added it.
> > > 
> > > All these problems would seem to go away if mod_timer just queued up
> > > the timer to a single list on the base then pushed them into the
> > > wheel during your wheel processing softirq... Although maybe you end
> > > up with excessive passes over a big queue of timers. Anyway that
> > > wouldn't be suitable for 4.13 even if it could work.
> > > 
> > > I'll send out an updated minimal fix after some more testing...
> > 
> > Hi All,
> > 
> > I'm back in the office with hardware access on our D05 64 core ARM64
> > boards.
> > 
> > I think we still have by far the quickest test cases for this so
> > feel free to ping me anything you want tested quickly (we were
> > looking at an average of less than 10 minutes to trigger
> > with machine idling).

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-21 Thread Nicholas Piggin
On Mon, 21 Aug 2017 11:18:33 +0100
Jonathan Cameron  wrote:

> On Mon, 21 Aug 2017 16:06:05 +1000
> Nicholas Piggin  wrote:
> 
> > On Mon, 21 Aug 2017 10:52:58 +1000
> > Nicholas Piggin  wrote:
> >   
> > > On Sun, 20 Aug 2017 14:14:29 -0700
> > > "Paul E. McKenney"  wrote:
> > > 
> > > > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:  
> > > > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:  
> > > > >   
> > > > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > > > Nicholas Piggin  wrote:
> > > > > > 
> > > > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > > > "Paul E. McKenney"  wrote:
> > > > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney 
> > > > > > > > wrote:
> > > > > > > > 
> > > > > > > > Thomas, John, am I misinterpreting the timer trace event 
> > > > > > > > messages?  
> > > > > > > 
> > > > > > > So I did some digging, and what you find is that rcu_sched
> > > > > > > seems to do a simple schedule_timeout(1) and just goes out to
> > > > > > > lunch for many seconds. The process_timeout timer never fires
> > > > > > > (when it finally does wake after one of these events, it
> > > > > > > usually removes the timer with del_timer_sync).
> > > > > > > 
> > > > > > > So this patch seems to fix it. Testing, comments welcome.
> > > > > > 
> > > > > > Okay this had a problem of trying to forward the timer from a timer
> > > > > > callback function.
> > > > > > 
> > > > > > This was my other approach which also fixes the RCU warnings,
> > > > > > but it's a little more complex. I reworked it a bit so the
> > > > > > mod_timer fast path hopefully doesn't have much more overhead
> > > > > > (actually by reading jiffies only when needed, it probably
> > > > > > saves a load).
> > > > > 
> > > > > Giving this one a whirl!
> > > > 
> > > > No joy here, but then again there are other reasons to believe that I
> > > > am seeing a different bug than Dave and Jonathan are.
> > > > 
> > > > OK, not -entirely- without joy -- 10 of 14 runs were error-free,
> > > > which is a good improvement over 0 of 84 for your earlier patch.
> > > > ;-)  But that is not statistically different from what I see
> > > > without either patch, and I still see the "rcu_sched kthread
> > > > starved" messages.
> > > > 
> > > > For whatever it is worth, by the way, I also see this: "hrtimer:
> > > > interrupt took 5712368 ns".  Hmmm...  I am also seeing that without
> > > > any of your patches.  Might be hypervisor preemption, I guess.
> > > 
> > > Okay it makes the warnings go away for me, but I'm just booting then
> > > leaving the system idle. You're doing some CPU hotplug activity?
> > 
> > Okay found a bug in the patch (it was not forwarding properly before
> > adding the first timer after an idle) and a few other concerns.
> > 
> > There's still a problem of a timer function doing a mod_timer() from
> > within expire_timers. It can't forward the base, which might currently
> > be quite a way behind. I *think* after we close these gaps and get
> > timely wakeups for timers on there, it should not get too far behind
> > for standard timers.
> > 
> > Deferrable is a different story. Firstly it has no idle tracking so we
> > never forward it. Even if we wanted to, we can't do it reliably because
> > it could contain timers way behind the base. They are "deferrable", so
> > you get what you pay for, but this still means there's a window where
> > you can add a deferrable timer and get a far later expiry than you
> > asked for despite the CPU never going idle after you added it.
> > 
> > All these problems would seem to go away if mod_timer just queued up
> > the timer to a single list on the base then pushed them into the
> > wheel during your wheel processing softirq... Although maybe you end
> > up with excessive passes over a big queue of timers. Anyway that
> > wouldn't be suitable for 4.13 even if it could work.
> > 
> > I'll send out an updated minimal fix after some more testing...  
> 
> Hi All,
> 
> I'm back in the office with hardware access on our D05 64 core ARM64
> boards.
> 
> I think we still have by far the quickest test cases for this so
> feel free to ping me anything you want tested quickly (we were
> looking at an average of less than 10 minutes to trigger
> with machine idling).
> 
> Nick, I'm currently running your previous version and we are over an
> hour in now without any instances of the issue, so it looks like a
> considerable improvement.  I'll see if I can line a couple of boards
> up for an overnight run if you have your updated version out by then.
> 
> Be great to finally put this one to bed.

Hi Jonathan,

Thanks 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-21 Thread Jonathan Cameron
On Mon, 21 Aug 2017 16:06:05 +1000
Nicholas Piggin  wrote:

> On Mon, 21 Aug 2017 10:52:58 +1000
> Nicholas Piggin  wrote:
> 
> > On Sun, 20 Aug 2017 14:14:29 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:
> > > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:  
> > > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > > Nicholas Piggin  wrote:
> > > > >   
> > > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > > "Paul E. McKenney"  wrote:  
> > > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > > > 
> > > > > > > Thomas, John, am I misinterpreting the timer trace event 
> > > > > > > messages?
> > > > > > 
> > > > > > So I did some digging, and what you find is that rcu_sched
> > > > > > seems to do a simple schedule_timeout(1) and just goes out to
> > > > > > lunch for many seconds. The process_timeout timer never fires
> > > > > > (when it finally does wake after one of these events, it
> > > > > > usually removes the timer with del_timer_sync).
> > > > > > 
> > > > > > So this patch seems to fix it. Testing, comments welcome.
> > > > > 
> > > > > Okay this had a problem of trying to forward the timer from a timer
> > > > > callback function.
> > > > > 
> > > > > This was my other approach which also fixes the RCU warnings, but it's
> > > > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > > > only when needed, it probably saves a load).  
> > > > 
> > > > Giving this one a whirl!  
> > > 
> > > No joy here, but then again there are other reasons to believe that I
> > > am seeing a different bug than Dave and Jonathan are.
> > > 
> > > OK, not -entirely- without joy -- 10 of 14 runs were error-free,
> > > which is a good improvement over 0 of 84 for your earlier patch.
> > > ;-)  But that is not statistically different from what I see without
> > > either patch, and I still see the "rcu_sched kthread starved"
> > > messages.
> > > 
> > > For whatever it is worth, by the way, I also see this: "hrtimer:
> > > interrupt took 5712368 ns".  Hmmm...  I am also seeing that without
> > > any of your patches.  Might be hypervisor preemption, I guess.
> > 
> > Okay it makes the warnings go away for me, but I'm just booting then
> > leaving the system idle. You're doing some CPU hotplug activity?  
> 
> Okay found a bug in the patch (it was not forwarding properly before
> adding the first timer after an idle) and a few other concerns.
> 
> There's still a problem of a timer function doing a mod_timer() from
> within expire_timers. It can't forward the base, which might currently
> be quite a way behind. I *think* after we close these gaps and get
> timely wakeups for timers on there, it should not get too far behind
> for standard timers.
> 
> Deferrable is a different story. Firstly it has no idle tracking so we
> never forward it. Even if we wanted to, we can't do it reliably because
> it could contain timers way behind the base. They are "deferrable", so
> you get what you pay for, but this still means there's a window where
> you can add a deferrable timer and get a far later expiry than you
> asked for despite the CPU never going idle after you added it.
> 
> All these problems would seem to go away if mod_timer just queued up
> the timer to a single list on the base then pushed them into the
> wheel during your wheel processing softirq... Although maybe you end
> up with excessive passes over a big queue of timers. Anyway that
> wouldn't be suitable for 4.13 even if it could work.
> 
> I'll send out an updated minimal fix after some more testing...

Hi All,

I'm back in the office with hardware access on our D05 64 core ARM64
boards.

I think we still have by far the quickest test cases for this so
feel free to ping me anything you want tested quickly (we were
looking at an average of less than 10 minutes to trigger
with machine idling).

Nick, I'm currently running your previous version and we are over an
hour in now without any instances of the issue, so it looks like a
considerable improvement.  I'll see if I can line a couple of boards
up for an overnight run if you have your updated version out by then.

Be great to finally put this one to bed.

Thanks,

Jonathan

> 
> Thanks,
> Nick



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-21 Thread Nicholas Piggin
On Mon, 21 Aug 2017 10:52:58 +1000
Nicholas Piggin  wrote:

> On Sun, 20 Aug 2017 14:14:29 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:  
> > > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:
> > > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > > Nicholas Piggin  wrote:
> > > > 
> > > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > > "Paul E. McKenney"  wrote:
> > > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > > 
> > > > > > Thomas, John, am I misinterpreting the timer trace event messages?  
> > > > > > 
> > > > > 
> > > > > So I did some digging, and what you find is that rcu_sched seems
> > > > > to do a simple schedule_timeout(1) and just goes out to lunch for
> > > > > many seconds. The process_timeout timer never fires (when it
> > > > > finally does wake after one of these events, it usually removes
> > > > > the timer with del_timer_sync).
> > > > > 
> > > > > So this patch seems to fix it. Testing, comments welcome.
> > > > 
> > > > Okay this had a problem of trying to forward the timer from a timer
> > > > callback function.
> > > > 
> > > > This was my other approach which also fixes the RCU warnings, but it's
> > > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > > only when needed, it probably saves a load).
> > > 
> > > Giving this one a whirl!
> > 
> > No joy here, but then again there are other reasons to believe that I
> > am seeing a different bug than Dave and Jonathan are.
> > 
> > OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
> > is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
> > that is not statistically different from what I see without either
> > patch, and I still see the "rcu_sched kthread starved" messages.
> > 
> > For whatever it is worth, by the way, I also see this: "hrtimer:
> > interrupt took 5712368 ns".  Hmmm...  I am also seeing that without
> > any of your patches.  Might be hypervisor preemption, I guess.
> 
> Okay it makes the warnings go away for me, but I'm just booting then
> leaving the system idle. You're doing some CPU hotplug activity?

Okay found a bug in the patch (it was not forwarding properly before
adding the first timer after an idle) and a few other concerns.

There's still a problem of a timer function doing a mod_timer() from
within expire_timers. It can't forward the base, which might currently
be quite a way behind. I *think* after we close these gaps and get
timely wakeups for timers on there, it should not get too far behind
for standard timers.

Deferrable is a different story. Firstly it has no idle tracking so we
never forward it. Even if we wanted to, we can't do it reliably because
it could contain timers way behind the base. They are "deferrable", so
you get what you pay for, but this still means there's a window where
you can add a deferrable timer and get a far later expiry than you
asked for despite the CPU never going idle after you added it.

All these problems would seem to go away if mod_timer just queued up
the timer to a single list on the base then pushed them into the
wheel during your wheel processing softirq... Although maybe you end
up with excessive passes over a big queue of timers. Anyway that
wouldn't be suitable for 4.13 even if it could work.

I'll send out an updated minimal fix after some more testing...

Thanks,
Nick


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-20 Thread Nicholas Piggin
On Sun, 20 Aug 2017 14:14:29 -0700
"Paul E. McKenney"  wrote:

> On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:
> > On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:  
> > > On Sun, 20 Aug 2017 14:45:53 +1000
> > > Nicholas Piggin  wrote:
> > >   
> > > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > > "Paul E. McKenney"  wrote:  
> > > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > > 
> > > > > Thomas, John, am I misinterpreting the timer trace event messages?
> > > > 
> > > > So I did some digging, and what you find is that rcu_sched seems to do a
> > > > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > > > The process_timeout timer never fires (when it finally does wake after
> > > > one of these events, it usually removes the timer with del_timer_sync).
> > > > 
> > > > So this patch seems to fix it. Testing, comments welcome.  
> > > 
> > > Okay this had a problem of trying to forward the timer from a timer
> > > callback function.
> > > 
> > > This was my other approach which also fixes the RCU warnings, but it's
> > > a little more complex. I reworked it a bit so the mod_timer fast path
> > > hopefully doesn't have much more overhead (actually by reading jiffies
> > > only when needed, it probably saves a load).  
> > 
> > Giving this one a whirl!  
> 
> No joy here, but then again there are other reasons to believe that I
> am seeing a different bug than Dave and Jonathan are.
> 
> OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
> is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
> that is not statistically different from what I see without either
> patch, and I still see the "rcu_sched kthread starved" messages.
> 
> For whatever it is worth, by the way, I also see this: "hrtimer:
> interrupt took 5712368 ns".  Hmmm...  I am also seeing that without
> any of your patches.  Might be hypervisor preemption, I guess.

Okay it makes the warnings go away for me, but I'm just booting then
leaving the system idle. You're doing some CPU hotplug activity?

Thanks,
Nick


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-20 Thread Paul E. McKenney
On Sun, Aug 20, 2017 at 11:35:14AM -0700, Paul E. McKenney wrote:
> On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:
> > On Sun, 20 Aug 2017 14:45:53 +1000
> > Nicholas Piggin  wrote:
> > 
> > > On Wed, 16 Aug 2017 09:27:31 -0700
> > > "Paul E. McKenney"  wrote:
> > > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > > 
> > > > Thomas, John, am I misinterpreting the timer trace event messages?  
> > > 
> > > So I did some digging, and what you find is that rcu_sched seems to do a
> > > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > > The process_timeout timer never fires (when it finally does wake after
> > > one of these events, it usually removes the timer with del_timer_sync).
> > > 
> > > So this patch seems to fix it. Testing, comments welcome.
> > 
> > Okay this had a problem of trying to forward the timer from a timer
> > callback function.
> > 
> > This was my other approach which also fixes the RCU warnings, but it's
> > a little more complex. I reworked it a bit so the mod_timer fast path
> > hopefully doesn't have much more overhead (actually by reading jiffies
> > only when needed, it probably saves a load).
> 
> Giving this one a whirl!

No joy here, but then again there are other reasons to believe that I
am seeing a different bug than Dave and Jonathan are.

OK, not -entirely- without joy -- 10 of 14 runs were error-free, which
is a good improvement over 0 of 84 for your earlier patch.  ;-)  But
that is not statistically different from what I see without either
patch, and I still see the "rcu_sched kthread starved" messages.

For whatever it is worth, by the way, I also see this: "hrtimer:
interrupt took 5712368 ns".  Hmmm...  I am also seeing that without
any of your patches.  Might be hypervisor preemption, I guess.

Thanx, Paul

PS.  I will be off the grid for the next day or so.  Eclipse day here...



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-20 Thread Paul E. McKenney
On Sun, Aug 20, 2017 at 11:00:40PM +1000, Nicholas Piggin wrote:
> On Sun, 20 Aug 2017 14:45:53 +1000
> Nicholas Piggin  wrote:
> 
> > On Wed, 16 Aug 2017 09:27:31 -0700
> > "Paul E. McKenney"  wrote:
> > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > > 
> > > Thomas, John, am I misinterpreting the timer trace event messages?  
> > 
> > So I did some digging, and what you find is that rcu_sched seems to do a
> > simple schedule_timeout(1) and just goes out to lunch for many seconds.
> > The process_timeout timer never fires (when it finally does wake after
> > one of these events, it usually removes the timer with del_timer_sync).
> > 
> > So this patch seems to fix it. Testing, comments welcome.
> 
> Okay this had a problem of trying to forward the timer from a timer
> callback function.
> 
> This was my other approach which also fixes the RCU warnings, but it's
> a little more complex. I reworked it a bit so the mod_timer fast path
> hopefully doesn't have much more overhead (actually by reading jiffies
> only when needed, it probably saves a load).

Giving this one a whirl!

Thanx, Paul

> Thanks,
> Nick
> 
> --
> [PATCH] timers: Fix excessive granularity of new timers after a nohz idle
> 
> When a timer base is idle, it is forwarded when a new timer is added to
> ensure that granularity does not become excessive. When not idle, the
> timer tick is expected to increment the base.
> 
> However there is a window after a timer is restarted from nohz, when it
> is marked not-idle, and before the timer tick on this CPU, where a timer
> may be added on an ancient base that does not get forwarded (because
> the timer appears not-idle).
> 
> This results in excessive granularity. So much so that a 1 jiffy timeout
> has blown out to 10s of seconds and triggered the RCU stall warning
> detector.
> 
> Fix this by keeping track of whether the timer base has been idle since
> it was last run or forwarded, and allow forwarding in the case that is
> true (even if it is not currently idle).
> 
> Also add a comment noting a case where we could get an unexpectedly
> large granularity for a timer. I debugged this problem by adding
> warnings for such cases, but it seems we can't add them in general due
> to this corner case.
> 
> Signed-off-by: Nicholas Piggin 
> ---
>  kernel/time/timer.c | 32 +++-
>  1 file changed, 27 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 8f5d1bf18854..ee7b8b688b48 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -203,6 +203,7 @@ struct timer_base {
> 	bool			migration_enabled;
> 	bool			nohz_active;
> 	bool			is_idle;
> +	bool			was_idle;
>   DECLARE_BITMAP(pending_map, WHEEL_SIZE);
>   struct hlist_head   vectors[WHEEL_SIZE];
>  } cacheline_aligned;
> @@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags)
> 
>  static inline void forward_timer_base(struct timer_base *base)
>  {
> - unsigned long jnow = READ_ONCE(jiffies);
> + unsigned long jnow;
> 
>   /*
> -  * We only forward the base when it's idle and we have a delta between
> -  * base clock and jiffies.
> +  * We only forward the base when we are idle or have just come out
> +  * of idle (was_idle logic), and have a delta between base clock
> +  * and jiffies. In the common case, run_timers will take care of it.
>*/
> - if (!base->is_idle || (long) (jnow - base->clk) < 2)
> + if (likely(!base->was_idle))
> + return;
> +
> + jnow = READ_ONCE(jiffies);
> + base->was_idle = base->is_idle;
> + if ((long)(jnow - base->clk) < 2)
>   return;
> 
>   /*
> @@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
>* same array bucket then just return:
>*/
>   if (timer_pending(timer)) {
> + /*
> +  * The downside of this optimization is that it can result in
> +  * larger granularity than you would get from adding a new
> +  * timer with this expiry. Would a timer flag for networking
> +  * be appropriate, then we can try to keep expiry of general
> +  * timers within ~1/8th of their interval?
> +  */
>   if (timer->expires == expires)
>   return 1;
> 
> @@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
>   /*
>* If we expect to sleep more than a tick, mark the base idle:
>*/
> - if ((expires - basem) > TICK_NSEC)
> + if ((expires - basem) > TICK_NSEC) {
> + base->was_idle = true;
>   

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-20 Thread Nicholas Piggin
On Sun, 20 Aug 2017 14:45:53 +1000
Nicholas Piggin  wrote:

> On Wed, 16 Aug 2017 09:27:31 -0700
> "Paul E. McKenney"  wrote:
> > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > 
> > Thomas, John, am I misinterpreting the timer trace event messages?  
> 
> So I did some digging, and what you find is that rcu_sched seems to do a
> simple schedule_timeout(1) and just goes out to lunch for many seconds.
> The process_timeout timer never fires (when it finally does wake after
> one of these events, it usually removes the timer with del_timer_sync).
> 
> So this patch seems to fix it. Testing, comments welcome.

Okay this had a problem of trying to forward the timer from a timer
callback function.

This was my other approach which also fixes the RCU warnings, but it's
a little more complex. I reworked it a bit so the mod_timer fast path
hopefully doesn't have much more overhead (actually by reading jiffies
only when needed, it probably saves a load).

Thanks,
Nick

--
[PATCH] timers: Fix excessive granularity of new timers after a nohz idle

When a timer base is idle, it is forwarded when a new timer is added to
ensure that granularity does not become excessive. When not idle, the
timer tick is expected to increment the base.

However there is a window after a timer is restarted from nohz, when it
is marked not-idle, and before the timer tick on this CPU, where a timer
may be added on an ancient base that does not get forwarded (because
the timer appears not-idle).

This results in excessive granularity. So much so that a 1 jiffy timeout
has blown out to 10s of seconds and triggered the RCU stall warning
detector.

Fix this by keeping track of whether the timer base has been idle since
it was last run or forwarded, and allow forwarding in the case that is
true (even if it is not currently idle).

Also add a comment noting a case where we could get an unexpectedly
large granularity for a timer. I debugged this problem by adding
warnings for such cases, but it seems we can't add them in general due
to this corner case.

Signed-off-by: Nicholas Piggin 
---
 kernel/time/timer.c | 32 +++-
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 8f5d1bf18854..ee7b8b688b48 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -203,6 +203,7 @@ struct timer_base {
 	bool			migration_enabled;
 	bool			nohz_active;
 	bool			is_idle;
+	bool			was_idle;
 	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
 	struct hlist_head	vectors[WHEEL_SIZE];
 } cacheline_aligned;
@@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags)
 
 static inline void forward_timer_base(struct timer_base *base)
 {
-   unsigned long jnow = READ_ONCE(jiffies);
+   unsigned long jnow;
 
/*
-* We only forward the base when it's idle and we have a delta between
-* base clock and jiffies.
+* We only forward the base when we are idle or have just come out
+* of idle (was_idle logic), and have a delta between base clock
+* and jiffies. In the common case, run_timers will take care of it.
 */
-   if (!base->is_idle || (long) (jnow - base->clk) < 2)
+   if (likely(!base->was_idle))
+   return;
+
+   jnow = READ_ONCE(jiffies);
+   base->was_idle = base->is_idle;
+   if ((long)(jnow - base->clk) < 2)
return;
 
/*
@@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 * same array bucket then just return:
 */
if (timer_pending(timer)) {
+   /*
+* The downside of this optimization is that it can result in
+* larger granularity than you would get from adding a new
+* timer with this expiry. Would a timer flag for networking
+* be appropriate, then we can try to keep expiry of general
+* timers within ~1/8th of their interval?
+*/
if (timer->expires == expires)
return 1;
 
@@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
/*
 * If we expect to sleep more than a tick, mark the base idle:
 */
-   if ((expires - basem) > TICK_NSEC)
+   if ((expires - basem) > TICK_NSEC) {
+   base->was_idle = true;
base->is_idle = true;
+   }
}
	raw_spin_unlock(&base->lock);
 
@@ -1587,6 +1603,12 @@ static inline void __run_timers(struct timer_base *base)
struct hlist_head heads[LVL_DEPTH];
int levels;
 
+   /*
+* was_idle must be cleared before running 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-19 Thread Paul E. McKenney
On Sun, Aug 20, 2017 at 02:45:53PM +1000, Nicholas Piggin wrote:
> On Wed, 16 Aug 2017 09:27:31 -0700
> "Paul E. McKenney"  wrote:
> > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> > 
> > Thomas, John, am I misinterpreting the timer trace event messages?
> 
> So I did some digging, and what you find is that rcu_sched seems to do a
> simple schedule_timeout(1) and just goes out to lunch for many seconds.
> The process_timeout timer never fires (when it finally does wake after
> one of these events, it usually removes the timer with del_timer_sync).
> 
> So this patch seems to fix it. Testing, comments welcome.

Fired up some tests, which should complete in about 12 hours.

Here is hoping!  ;-)

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-19 Thread David Miller
From: Nicholas Piggin 
Date: Sun, 20 Aug 2017 14:45:53 +1000

> [PATCH] timers: Fix excessive granularity of new timers after a nohz idle

I just booted into this on my sparc64 box, let's see if it does the
trick :-)


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-19 Thread Nicholas Piggin
On Wed, 16 Aug 2017 09:27:31 -0700
"Paul E. McKenney"  wrote:
> On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> 
> Thomas, John, am I misinterpreting the timer trace event messages?

So I did some digging, and what you find is that rcu_sched seems to do a
simple schedule_timeout(1) and just goes out to lunch for many seconds.
The process_timeout timer never fires (when it finally does wake after
one of these events, it usually removes the timer with del_timer_sync).

So this patch seems to fix it. Testing, comments welcome.

Thanks,
Nick

[PATCH] timers: Fix excessive granularity of new timers after a nohz idle

When a timer base is idle, it is forwarded when a new timer is added to
ensure that granularity does not become excessive. When not idle, the
timer tick is expected to increment the base.

However there is a window after a timer is restarted from nohz, when it
is marked not-idle, and before the timer tick on this CPU, where a timer
may be added on an ancient base that does not get forwarded (because
the timer appears not-idle).

This results in excessive granularity. So much so that a 1 jiffy timeout
has blown out to 10s of seconds and triggered the RCU stall warning
detector.

Fix this by always forwarding the base when adding a new timer if it is
more than 1 jiffy behind. Another approach I looked at first was to note
if the base was idle but not yet run or forwarded, however this just
seemed to add more branches and complexity when it seems we can just
cover it with this test.

Also add a comment noting a case where we could get an unexpectedly
large granularity for a timer. I debugged this problem by adding
warnings for such cases, but it seems we can't add them in general due
to this corner case.

Signed-off-by: Nicholas Piggin 
---
 kernel/time/timer.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 8f5d1bf18854..8f69b3105b8f 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -859,10 +859,10 @@ static inline void forward_timer_base(struct timer_base *base)
unsigned long jnow = READ_ONCE(jiffies);
 
/*
-* We only forward the base when it's idle and we have a delta between
-* base clock and jiffies.
+* We only forward the base when we have a delta between base clock
+* and jiffies. In the common case, run_timers will take care of it.
 */
-   if (!base->is_idle || (long) (jnow - base->clk) < 2)
+   if ((long) (jnow - base->clk) < 2)
return;
 
/*
@@ -938,6 +938,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 * same array bucket then just return:
 */
if (timer_pending(timer)) {
+   /*
+* The downside of this optimization is that it can result in
+* larger granularity than you would get from adding a new
+* timer with this expiry. Would a timer flag for networking
+* be appropriate, then we can try to keep expiry of general
+* timers within ~1/8th of their interval?
+*/
if (timer->expires == expires)
return 1;
 
-- 
2.13.3



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-17 Thread Michael Ellerman
"Paul E. McKenney"  writes:

> On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
>> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
>> > "Paul E. McKenney"  writes:
>> > ...
>> > >
>> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
>> > > Author: Paul E. McKenney 
>> > > Date:   Mon Aug 14 08:54:39 2017 -0700
>> > >
>> > > EXP: Trace tick return from tick_nohz_stop_sched_tick
>> > > 
>> > > Signed-off-by: Paul E. McKenney 
>> > >
>> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> > > index c7a899c5ce64..7358a5073dfb 100644
>> > > --- a/kernel/time/tick-sched.c
>> > > +++ b/kernel/time/tick-sched.c
>> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>> > >   * (not only the tick).
>> > >   */
>> > >  ts->sleep_length = ktime_sub(dev->next_event, now);
>> > > +trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>> > >  return tick;
>> > >  }
>> > 
>> > Should I be seeing negative values? A small sample:
>> 
>> Maybe due to hypervisor preemption delays, but I confess that I am
>> surprised to see them this large.  1,602,250,019 microseconds is something
>> like a half hour, which could result in stall warnings all by itself.

Hmm. This is a bare metal machine. So no hypervisor.

>> I will take a look!
>
> And from your ps output, PID 9 is rcu_sched, which is the RCU grace-period
> kthread that stalled.  This kthread was starved, based on this from your
> dmesg:
>
> [ 1602.067008] rcu_sched kthread starved for 2603 jiffies! g7275 c7274 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
>
> The RCU_GP_WAIT_FQS says that this kthread is periodically scanning for
> idle-CPU and offline-CPU quiescent states, which means that its waits
> will be accompanied by short timeouts.  The "starved for 2603 jiffies"
> says that it has not run for one good long time.  The ->state is its
> task_struct ->state field.
>
> The immediately preceding dmesg line is as follows:
>
> [ 1602.063851]  (detected by 53, t=2603 jiffies, g=7275, c=7274, q=608)
>
> In other words, the rcu_sched grace-period kthread has been starved
> for the entire duration of the current grace period, as shown by the
> t=2603.
>
> Let's turn now to the trace output, looking for the last bit of the
> rcu_sched task's activity:
>
>    rcu_sched-9 [054] d...  1576.030096: timer_start: timer=c007fae1bc20 function=process_timeout expires=4295094922 [timeout=1] cpu=54 idx=0 flags=
> ksoftirqd/53-276 [053] ..s.  1576.030097: rcu_invoke_callback: rcu_sched rhp=c00fcf8c4eb0 func=__d_free
>    rcu_sched-9 [054] d...  1576.030097: rcu_utilization: Start context switch
> ksoftirqd/53-276 [053] ..s.  1576.030098: rcu_invoke_callback: rcu_sched rhp=c00fcff74ee0 func=proc_i_callback
>    rcu_sched-9 [054] d...  1576.030098: rcu_grace_period: rcu_sched 7275 cpuqs
>    rcu_sched-9 [054] d...  1576.030099: rcu_utilization: End context switch
>
> So this task set up a timer ("timer_start:") for one jiffy ("[timeout=1]",
> but what is with "expires=4295094922"?)

That's a good one.

I have HZ=100, and therefore:

INITIAL_JIFFIES = (1 << 32) - (300 * 100) = 4294937296

So the expires value of 4295094922 is:

4295094922 - 4294937296 = 157626

Jiffies since boot.

Or 1,576,260,000,000 ns == 1576.26 s.
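
That arithmetic checks out with a few lines of C; the INITIAL_JIFFIES
expression below mirrors the definition in include/linux/jiffies.h, and
the expires value is taken from the timer_start trace line:

#include <stdio.h>
#include <stdint.h>

#define HZ 100

int main(void)
{
	/* INITIAL_JIFFIES == (unsigned long)(unsigned int)(-300 * HZ) */
	uint64_t initial_jiffies = (uint32_t)(-300 * HZ);
	uint64_t expires = 4295094922ULL;	/* from the trace */
	uint64_t since_boot = expires - initial_jiffies;

	printf("INITIAL_JIFFIES    = %llu\n",
	       (unsigned long long)initial_jiffies);
	printf("jiffies since boot = %llu\n",
	       (unsigned long long)since_boot);	/* 157626 */
	printf("uptime             = %.2f s\n",
	       (double)since_boot / HZ);	/* 1576.26 */
	return 0;
}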

> Of course, the timer will have expired in the context of some other task,
> but a search for "c007fae1bc20" (see the "timer=" in the first trace
> line above) shows nothing (to be painfully accurate, the search wraps back
> to earlier uses of this timer by rcu_sched).  So the timer never did fire.

Or it just wasn't in the trace?

I'll try and get it to trace a bit longer and see if that is helpful.

cheers


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-16 Thread Paul E. McKenney
On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney"  writes:
> > ...
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney 
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > > EXP: Trace tick return from tick_nohz_stop_sched_tick
> > > 
> > > Signed-off-by: Paul E. McKenney 
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> > >* (not only the tick).
> > >*/
> > >   ts->sleep_length = ktime_sub(dev->next_event, now);
> > > + trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> > >   return tick;
> > >  }
> > 
> > Should I be seeing negative values? A small sample:
> 
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
> 
> >   <idle>-0 [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >   <idle>-0 [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >   <idle>-0 [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >   <idle>-0 [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >   <idle>-0 [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >   <idle>-0 [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >   <idle>-0 [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >   <idle>-0 [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >   <idle>-0 [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >   <idle>-0 [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> > 
> > 
> > I have a full trace, I'll send it to you off-list.
> 
> I will take a look!

And from your ps output, PID 9 is rcu_sched, which is the RCU grace-period
kthread that stalled.  This kthread was starved, based on this from your
dmesg:

[ 1602.067008] rcu_sched kthread starved for 2603 jiffies! g7275 c7274 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1

The RCU_GP_WAIT_FQS says that this kthread is periodically scanning for
idle-CPU and offline-CPU quiescent states, which means that its waits
will be accompanied by short timeouts.  The "starved for 2603 jiffies"
says that it has not run for one good long time.  The ->state is its
task_struct ->state field.

The immediately preceding dmesg line is as follows:

[ 1602.063851]  (detected by 53, t=2603 jiffies, g=7275, c=7274, q=608)

In other words, the rcu_sched grace-period kthread has been starved
for the entire duration of the current grace period, as shown by the
t=2603.

Let's turn now to the trace output, looking for the last bit of the
rcu_sched task's activity:

   rcu_sched-9 [054] d...  1576.030096: timer_start: timer=c007fae1bc20 function=process_timeout expires=4295094922 [timeout=1] cpu=54 idx=0 flags=
ksoftirqd/53-276   [053] ..s.  1576.030097: rcu_invoke_callback: rcu_sched rhp=c00fcf8c4eb0 func=__d_free
   rcu_sched-9 [054] d...  1576.030097: rcu_utilization: Start context switch
ksoftirqd/53-276   [053] ..s.  1576.030098: rcu_invoke_callback: rcu_sched rhp=c00fcff74ee0 func=proc_i_callback
   rcu_sched-9 [054] d...  1576.030098: rcu_grace_period: rcu_sched 7275 cpuqs
   rcu_sched-9 [054] d...  1576.030099: rcu_utilization: End context switch

So this task set up a timer ("timer_start:") for one jiffy ("[timeout=1]",
but what is with "expires=4295094922"?)  and blocked ("rcu_utilization:
Start context switch" and "rcu_utilization: End context switch"),
recording its CPU's quiescent state in the process ("rcu_grace_period:
rcu_sched 7275 cpuqs").

Of course, the timer will have expired in the context of some other task,
but a search for "c007fae1bc20" (see the "timer=" in the first trace
line above) shows nothing (to be painfully accurate, the search wraps back
to earlier uses of this timer by rcu_sched).  So the timer never did fire.

The next question is "what did CPU 054 do next?"  We find it entering idle:

  <idle>-0 [054] d...  1576.030167: tick_stop: success=1 dependency=NONE
  <idle>-0 [054] d...  1576.030167: hrtimer_cancel: 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-16 Thread Nicholas Piggin
On Wed, 16 Aug 2017 05:56:17 -0700
"Paul E. McKenney"  wrote:

> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney"  writes:
> > ...  
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney 
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > > EXP: Trace tick return from tick_nohz_stop_sched_tick
> > > 
> > > Signed-off-by: Paul E. McKenney 
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> > >* (not only the tick).
> > >*/
> > >   ts->sleep_length = ktime_sub(dev->next_event, now);
> > > + trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> > >   return tick;
> > >  }  
> > 
> > Should I be seeing negative values? A small sample:  
> 
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
> 
> >   <idle>-0 [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >   <idle>-0 [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >   <idle>-0 [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >   <idle>-0 [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >   <idle>-0 [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >   <idle>-0 [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >   <idle>-0 [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >   <idle>-0 [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >   <idle>-0 [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >   <idle>-0 [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> > 
> > 
> > I have a full trace, I'll send it to you off-list.  
> 
> I will take a look!

I found this; I can't see that it would cause our symptoms, but it's
worth someone who knows the code taking a look at it.

--
cpuidle: fix broadcast control when broadcast can not be entered

When failing to enter broadcast timer mode for an idle state that
requires it, a new state is selected that does not require broadcast,
but the broadcast variable remains set. This causes
tick_broadcast_exit to be called despite not having entered broadcast
mode.

This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some
cases, but otherwise does not appear to cause problems.

Signed-off-by: Nicholas Piggin 
---
 drivers/cpuidle/cpuidle.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 60bb64f4329d..4453e27f855e 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -208,6 +208,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 			return -EBUSY;
 		}
 		target_state = &drv->states[index];
+		broadcast = false;
 	}
 
/* Take note of the planned idle state. */
-- 
2.13.3
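
For context, a hedged sketch of how that one-line fix sits in the
4.13-era cpuidle_enter_state() control flow (simplified; error handling,
timekeeping, and statistics omitted, so not the verbatim kernel code):

int cpuidle_enter_state(struct cpuidle_device *dev,
			struct cpuidle_driver *drv, int index)
{
	struct cpuidle_state *target_state = &drv->states[index];
	bool broadcast = !!(target_state->flags & CPUIDLE_FLAG_TIMER_STOP);

	/* The local timer will be shut down; try to switch to a
	 * broadcast timer.  This can fail, e.g. when the local timer
	 * is itself the broadcast device. */
	if (broadcast && tick_broadcast_enter()) {
		/* Fall back to a state that does not need broadcast. */
		index = find_deepest_state(drv, dev, target_state->exit_latency,
					   CPUIDLE_FLAG_TIMER_STOP, false);
		if (index < 0)
			return -EBUSY;
		target_state = &drv->states[index];
		broadcast = false;	/* the fix: broadcast mode was never entered */
	}

	/* ... enter and leave the selected idle state ... */

	if (broadcast)
		tick_broadcast_exit();	/* without the fix, reached spuriously */
	return index;
}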



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-16 Thread Paul E. McKenney
On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> "Paul E. McKenney"  writes:
> ...
> >
> > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > Author: Paul E. McKenney 
> > Date:   Mon Aug 14 08:54:39 2017 -0700
> >
> > EXP: Trace tick return from tick_nohz_stop_sched_tick
> > 
> > Signed-off-by: Paul E. McKenney 
> >
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index c7a899c5ce64..7358a5073dfb 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> >  	 * (not only the tick).
> >  	 */
> >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> >  	return tick;
> >  }
> 
> Should I be seeing negative values? A small sample:

Maybe due to hypervisor preemption delays, but I confess that I am
surprised to see them this large.  1,602,250,019 microseconds is something
like a half hour, which could result in stall warnings all by itself.

>   <idle>-0 [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
>   <idle>-0 [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>   <idle>-0 [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>   <idle>-0 [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
>   <idle>-0 [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
>   <idle>-0 [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
>   <idle>-0 [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
>   <idle>-0 [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>   <idle>-0 [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>   <idle>-0 [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> 
> 
> I have a full trace, I'll send it to you off-list.

I will take a look!

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-16 Thread Michael Ellerman
"Paul E. McKenney"  writes:
...
>
> commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> Author: Paul E. McKenney 
> Date:   Mon Aug 14 08:54:39 2017 -0700
>
> EXP: Trace tick return from tick_nohz_stop_sched_tick
> 
> Signed-off-by: Paul E. McKenney 
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index c7a899c5ce64..7358a5073dfb 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
>  	 * (not only the tick).
>  	 */
>  	ts->sleep_length = ktime_sub(dev->next_event, now);
> +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
>  	return tick;
>  }

Should I be seeing negative values? A small sample:

  <idle>-0 [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
  <idle>-0 [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
  <idle>-0 [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
  <idle>-0 [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
  <idle>-0 [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
  <idle>-0 [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
  <idle>-0 [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
  <idle>-0 [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
  <idle>-0 [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
  <idle>-0 [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018


I have a full trace, I'll send it to you off-list.

cheers


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-15 Thread Jonathan Cameron
On Tue, 15 Aug 2017 08:47:43 -0700
"Paul E. McKenney"  wrote:

> On Wed, Aug 02, 2017 at 05:25:55PM +0100, Jonathan Cameron wrote:
> > On Tue, 1 Aug 2017 11:46:46 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:  
> > > > On Mon, 31 Jul 2017 08:04:11 -0700
> > > > "Paul E. McKenney"  wrote:
> > > > 
> > > > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> > > > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > > > "Paul E. McKenney"  wrote:
> > > > > >   
> > > > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote: 
> > > > > > >  
> > > > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > > > "Paul E. McKenney"  wrote:
> > > > > > > > 
> > > > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron 
> > > > > > > > > wrote:
> > > > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > > > Jonathan Cameron  wrote:   
> > > > > > > > > >
> > > > > > > > > 
> > > > > > > > > [ . . . ]
> > > > > > > > > 
> > > > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS)
> > > > > > > > > > in the interest of having fewer timer events.  Issue
> > > > > > > > > > became much easier to trigger (on some runs before I
> > > > > > > > > > could get tracing up and running).
> > > > > > > > > > So logs are large enough that pastebin doesn't like them -
> > > > > > > > > > please shout if another timer period is of interest.
> > > > > > > > > > 
> > > > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > > > 
> > > > > > > > > > The relevant timeout on the RCU stall detector was 8 
> > > > > > > > > > seconds.  Event is
> > > > > > > > > > detected around 835.
> > > > > > > > > > 
> > > > > > > > > > It's a lot of logs, so I haven't identified a smoking gun 
> > > > > > > > > > yet but there
> > > > > > > > > > may well be one in there.  
> > > > > > > > > 
> > > > > > > > > The dmesg says:
> > > > > > > > > 
> > > > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 
> > > > > > > > > RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > > > 
> > > > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > > > 
> > > > > > > > > rcu_preempt-9 [019]    827.579114: timer_init: 
> > > > > > > > > timer=8017d5fc7da0
> > > > > > > > > rcu_preempt-9 [019] d..1   827.579115: timer_start: 
> > > > > > > > > timer=8017d5fc7da0 function=process_timeout 
> > > > > > > > > 
> > > > > > > > > Next look for "8017d5fc7da0" and I don't find anything 
> > > > > > > > > else.
> > > > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > > > 
> > > > > > > >  rcu_preempt-9 [001] d..1   837.681077: timer_cancel: 
> > > > > > > > timer=8017d5fc7da0
> > > > > > > >  rcu_preempt-9 [001]    837.681086: timer_init: 
> > > > > > > > timer=8017d5fc7da0
> > > > > > > >  rcu_preempt-9 [001] d..1   837.681087: timer_start: 
> > > > > > > > timer=8017d5fc7da0 function=process_timeout 
> > > > > > > > expires=4295101298 [timeout=1] cpu=1 idx=0 flags=
> > > > > > > 
> > > > > > > Odd.  I would expect an expiration...  And ten seconds is way 
> > > > > > > longer
> > > > > > > than the requested one jiffy!
> > > > > > >   
> > > > > > > > > The timeout was one jiffy, and more than a second later, no 
> > > > > > > > > expiration.
> > > > > > > > > Is it possible that this event was lost?  I am not seeing any 
> > > > > > > > > sign of
> > > > > > > > > this in the trace.
> > > > > > > > > 
> > > > > > > > > I don't see any sign of CPU hotplug (and I test with lots of 
> > > > > > > > > that in
> > > > > > > > > any case).
> > > > > > > > > 
> > > > > > > > > The last time we saw something like this it was a timer 
> > > > > > > > > HW/driver problem,
> > > > > > > > > but it is a bit hard to imagine such a problem affecting both 
> > > > > > > > > ARM64
> > > > > > > > > and SPARC.  ;-)
> > > > > > > > Could be different issues, both of which were hidden by that 
> > > > > > > > lockup detector.
> > > > > > > > 
> > > > > > > > There is an errata workaround for the timers on this
> > > > > > > > particular board.
> > > > > > > > I'm only vaguely aware of it, so it may be unconnected.
> > > > > > > > 
> > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > > > 
> > > > > > > > Seems unlikely though!  Plus we've not yet seen it on the other 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-15 Thread Paul E. McKenney
On Wed, Aug 02, 2017 at 05:25:55PM +0100, Jonathan Cameron wrote:
> On Tue, 1 Aug 2017 11:46:46 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> > > On Mon, 31 Jul 2017 08:04:11 -0700
> > > "Paul E. McKenney"  wrote:
> > >   
> > > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > > "Paul E. McKenney"  wrote:
> > > > > 
> > > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:   
> > > > > >  
> > > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > > "Paul E. McKenney"  wrote:
> > > > > > >   
> > > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron 
> > > > > > > > wrote:  
> > > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > > Jonathan Cameron  wrote:
> > > > > > > > 
> > > > > > > > [ . . . ]
> > > > > > > >   
> > > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in
> > > > > > > > > the interest of having fewer timer events.  Issue became
> > > > > > > > > much easier to trigger (on some runs before I could get
> > > > > > > > > tracing up and running).
> > > > > > > > > So logs are large enough that pastebin doesn't like them -
> > > > > > > > > please shout if another timer period is of interest.
> > > > > > > > > 
> > > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > > 
> > > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds. 
> > > > > > > > >  Event is
> > > > > > > > > detected around 835.
> > > > > > > > > 
> > > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet 
> > > > > > > > > but there
> > > > > > > > > may well be one in there.
> > > > > > > > 
> > > > > > > > The dmesg says:
> > > > > > > > 
> > > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 
> > > > > > > > RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > > 
> > > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > > 
> > > > > > > > rcu_preempt-9 [019]    827.579114: timer_init: 
> > > > > > > > timer=8017d5fc7da0
> > > > > > > > rcu_preempt-9 [019] d..1   827.579115: timer_start: 
> > > > > > > > timer=8017d5fc7da0 function=process_timeout 
> > > > > > > > 
> > > > > > > > Next look for "8017d5fc7da0" and I don't find anything 
> > > > > > > > else.  
> > > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > > 
> > > > > > >  rcu_preempt-9 [001] d..1   837.681077: timer_cancel: 
> > > > > > > timer=8017d5fc7da0
> > > > > > >  rcu_preempt-9 [001]    837.681086: timer_init: 
> > > > > > > timer=8017d5fc7da0
> > > > > > >  rcu_preempt-9 [001] d..1   837.681087: timer_start: 
> > > > > > > timer=8017d5fc7da0 function=process_timeout 
> > > > > > > expires=4295101298 [timeout=1] cpu=1 idx=0 flags=  
> > > > > > 
> > > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > > than the requested one jiffy!
> > > > > > 
> > > > > > > > The timeout was one jiffy, and more than a second later, no 
> > > > > > > > expiration.
> > > > > > > > Is it possible that this event was lost?  I am not seeing any 
> > > > > > > > sign of
> > > > > > > > this in the trace.
> > > > > > > > 
> > > > > > > > I don't see any sign of CPU hotplug (and I test with lots of 
> > > > > > > > that in
> > > > > > > > any case).
> > > > > > > > 
> > > > > > > > The last time we saw something like this it was a timer 
> > > > > > > > HW/driver problem,
> > > > > > > > but it is a bit hard to imagine such a problem affecting both 
> > > > > > > > ARM64
> > > > > > > > and SPARC.  ;-)  
> > > > > > > Could be different issues, both of which were hidden by that 
> > > > > > > lockup detector.
> > > > > > > 
> > > > > > > There is an errata workaround for the timers on this particular
> > > > > > > board.
> > > > > > > I'm only vaguely aware of it, so it may be unconnected.
> > > > > > > 
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > > 
> > > > > > > Seems unlikely though!  Plus we've not yet seen it on the other
> > > > > > > chips that the errata affects (not that that means much).
> > > > > > 
> > > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > > 
> > > > > > Thanx, Paul
> > > > > Errata fix is running already and was for all those tests.
> > > > 
> > 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-02 Thread Jonathan Cameron
On Tue, 1 Aug 2017 11:46:46 -0700
"Paul E. McKenney"  wrote:

> On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> > On Mon, 31 Jul 2017 08:04:11 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:  
> > > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > > "Paul E. McKenney"  wrote:
> > > > 
> > > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:
> > > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > > "Paul E. McKenney"  wrote:
> > > > > >   
> > > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote: 
> > > > > > >  
> > > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > > Jonathan Cameron  wrote:
> > > > > > > 
> > > > > > > [ . . . ]
> > > > > > >   
> > > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in
> > > > > > > > the interest of having fewer timer events.  Issue became much
> > > > > > > > easier to trigger (on some runs before I could get tracing up
> > > > > > > > and running).
> > > > > > > > So logs are large enough that pastebin doesn't like them -
> > > > > > > > please shout if another timer period is of interest.
> > > > > > > > 
> > > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > > 
> > > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  
> > > > > > > > Event is
> > > > > > > > detected around 835.
> > > > > > > > 
> > > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet 
> > > > > > > > but there
> > > > > > > > may well be one in there.
> > > > > > > 
> > > > > > > The dmesg says:
> > > > > > > 
> > > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 
> > > > > > > RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > > 
> > > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > > 
> > > > > > > rcu_preempt-9 [019]    827.579114: timer_init: 
> > > > > > > timer=8017d5fc7da0
> > > > > > > rcu_preempt-9 [019] d..1   827.579115: timer_start: 
> > > > > > > timer=8017d5fc7da0 function=process_timeout 
> > > > > > > 
> > > > > > > Next look for "8017d5fc7da0" and I don't find anything else.  
> > > > > > > 
> > > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > > 
> > > > > >  rcu_preempt-9 [001] d..1   837.681077: timer_cancel: 
> > > > > > timer=8017d5fc7da0
> > > > > >  rcu_preempt-9 [001]    837.681086: timer_init: 
> > > > > > timer=8017d5fc7da0
> > > > > >  rcu_preempt-9 [001] d..1   837.681087: timer_start: 
> > > > > > timer=8017d5fc7da0 function=process_timeout expires=4295101298 
> > > > > > [timeout=1] cpu=1 idx=0 flags=  
> > > > > 
> > > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > > than the requested one jiffy!
> > > > > 
> > > > > > > The timeout was one jiffy, and more than a second later, no 
> > > > > > > expiration.
> > > > > > > Is it possible that this event was lost?  I am not seeing any 
> > > > > > > sign of
> > > > > > > this in the trace.
> > > > > > > 
> > > > > > > I don't see any sign of CPU hotplug (and I test with lots of that 
> > > > > > > in
> > > > > > > any case).
> > > > > > > 
> > > > > > > The last time we saw something like this it was a timer HW/driver 
> > > > > > > problem,
> > > > > > > but it is a bit hard to imagine such a problem affecting both 
> > > > > > > ARM64
> > > > > > > and SPARC.  ;-)  
> > > > > > Could be different issues, both of which were hidden by that lockup 
> > > > > > detector.
> > > > > > 
> > > > > > There is an errata workaround for the timers on this particular
> > > > > > board.
> > > > > > I'm only vaguely aware of it, so it may be unconnected.
> > > > > > 
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > > 
> > > > > > Seems unlikely though!  Plus we've not yet seen it on the other
> > > > > > chips that the errata affects (not that that means much).
> > > > > 
> > > > > If you can reproduce quickly, might be worth trying anyway...
> > > > > 
> > > > >   Thanx, Paul
> > > > Errata fix is running already and was for all those tests.
> > > 
> > > I was afraid of that...  ;-)  
> > It's a pretty rare errata it seems.  Not actually managed to catch
> > one yet.   
> > >   
> > > > I'll have a dig into the timers today and see where I get to.
> > > 
> > > Look forward to seeing what you find!  
> > Nothing obvious turning 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-01 Thread Paul E. McKenney
On Mon, Jul 31, 2017 at 04:27:57PM +0100, Jonathan Cameron wrote:
> On Mon, 31 Jul 2017 08:04:11 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> > > On Fri, 28 Jul 2017 12:03:50 -0700
> > > "Paul E. McKenney"  wrote:
> > >   
> > > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > > "Paul E. McKenney"  wrote:
> > > > > 
> > > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:   
> > > > > >  
> > > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > > Jonathan Cameron  wrote:  
> > > > > > 
> > > > > > [ . . . ]
> > > > > > 
> > > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the
> > > > > > > interest of having fewer timer events.  Issue became much easier
> > > > > > > to trigger (on some runs before I could get tracing up and
> > > > > > > running).
> > > > > > > So logs are large enough that pastebin doesn't like them -
> > > > > > > please shout if another timer period is of interest.
> > > > > > > 
> > > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > > 
> > > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  
> > > > > > > Event is
> > > > > > > detected around 835.
> > > > > > > 
> > > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but 
> > > > > > > there
> > > > > > > may well be one in there.  
> > > > > > 
> > > > > > The dmesg says:
> > > > > > 
> > > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 
> > > > > > RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > > 
> > > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > > 
> > > > > > rcu_preempt-9 [019]    827.579114: timer_init: 
> > > > > > timer=8017d5fc7da0
> > > > > > rcu_preempt-9 [019] d..1   827.579115: timer_start: 
> > > > > > timer=8017d5fc7da0 function=process_timeout 
> > > > > > 
> > > > > > Next look for "8017d5fc7da0" and I don't find anything else.
> > > > > It does show up off the bottom of what would fit in pastebin...
> > > > > 
> > > > >  rcu_preempt-9 [001] d..1   837.681077: timer_cancel: 
> > > > > timer=8017d5fc7da0
> > > > >  rcu_preempt-9 [001]    837.681086: timer_init: 
> > > > > timer=8017d5fc7da0
> > > > >  rcu_preempt-9 [001] d..1   837.681087: timer_start: 
> > > > > timer=8017d5fc7da0 function=process_timeout expires=4295101298 
> > > > > [timeout=1] cpu=1 idx=0 flags=
> > > > 
> > > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > > than the requested one jiffy!
> > > >   
> > > > > > The timeout was one jiffy, and more than a second later, no 
> > > > > > expiration.
> > > > > > Is it possible that this event was lost?  I am not seeing any sign 
> > > > > > of
> > > > > > this in the trace.
> > > > > > 
> > > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > > any case).
> > > > > > 
> > > > > > The last time we saw something like this it was a timer HW/driver 
> > > > > > problem,
> > > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > > and SPARC.  ;-)
> > > > > Could be different issues, both of which were hidden by that lockup 
> > > > > detector.
> > > > > 
> > > > > There is an errata workaround for the timers on this particular
> > > > > board.
> > > > > I'm only vaguely aware of it, so it may be unconnected.
> > > > > 
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > > 
> > > > > Seems unlikely though!  Plus we've not yet seen it on the other
> > > > > chips that the errata affects (not that that means much).
> > > > 
> > > > If you can reproduce quickly, might be worth trying anyway...
> > > > 
> > > > Thanx, Paul  
> > > Errata fix is running already and was for all those tests.  
> > 
> > I was afraid of that...  ;-)
> It's a pretty rare errata it seems.  Not actually managed to catch
> one yet. 
> > 
> > > I'll have a dig into the timers today and see where I get to.  
> > 
> > Look forward to seeing what you find!
> Nothing obvious turning up, other than we don't seem to have the issue
> when we aren't running hrtimers.
> 
> On the plus side, I just got a report that it is affecting our d03
> boards, which is good given that I couldn't tell what the difference
> could be wrt this issue!
> 
> It indeed looks like we are consistently missing a timer before
> the rcu splat occurs.

And for my part, my tests with CONFIG_HZ_PERIODIC=y and

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-01 Thread Jonathan Cameron

Sorry - accidental send.  No content!

Jonathan

On Mon, 31 Jul 2017 12:55:48 +0100
Jonathan Cameron  wrote:

> On Mon, 31 Jul 2017 12:09:08 +0100
> Jonathan Cameron  wrote:
> 
> > On Wed, 26 Jul 2017 16:15:05 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:
> > > > From: "Paul E. McKenney" 
> > > > Date: Wed, 26 Jul 2017 15:36:58 -0700
> > > >   
> > > > > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > > > > CPU stall warnings.  So it seems likely that 
> > > > > CONFIG_SOFTLOCKUP_DETECTOR
> > > > > really is having an effect.  
> > > > 
> > > > Thanks for all of the info Paul, I'll digest this and scan over the
> > > > code myself.
> > > > 
> > > > Just out of curiosity, what x86 idle method is your machine using?
> > > > The mwait one or the one which simply uses 'halt'?  The mwait variant
> > > > might mask this bug, and halt would be a lot closer to how sparc64 and
> > > > Jonathan's system operates.  
> > > 
> > > My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> > > I am not using the mwait one.  Here is a grep for IDLE in my .config:
> > > 
> > >   CONFIG_NO_HZ_IDLE=y
> > >   CONFIG_GENERIC_SMP_IDLE_THREAD=y
> > >   # CONFIG_IDLE_PAGE_TRACKING is not set
> > >   CONFIG_ACPI_PROCESSOR_IDLE=y
> > >   CONFIG_CPU_IDLE=y
> > >   # CONFIG_CPU_IDLE_GOV_LADDER is not set
> > >   CONFIG_CPU_IDLE_GOV_MENU=y
> > >   # CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
> > >   # CONFIG_INTEL_IDLE is not set
> > > 
> > > > On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> > > > local TICK register keeps advancing, and the local timer therefore
> > > > will still trigger.  Also, any externally generated interrupts
> > > > (including cross calls) will wake up the cpu as well.
> > > > 
> > > > The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> > > > case.  One of my running theories is that we miss scheduling a tick
> > > > due to a race.  That would be consistent with the behavior we see
> > > > in the RCU dumps, I think.  
> > > 
> > > But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> > > warning?  By default, your grace period needs to extend for more than
> > > 21 seconds (more than one-third of a -minute-) to get one.  Or do
> > > you mean that the ticks get shut off now and forever, as opposed to
> > > just losing one of them?
> > > 
> > > > Anyways, just a theory, and that's why I keep mentioning that commit
> > > > about the revert of the revert (specifically
> > > > 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> > > > 
> > > > :-)  
> > > 
> > > I am running an overnight test in preparation for attempting to push
> > > some fixes for regressions into 4.12, but will try reverting this
> > > and enabling CONFIG_HZ_PERIODIC tomorrow.
> > > 
> > > Jonathan, might the commit that Dave points out above be what reduces
> > > the probability of occurrence as you test older releases?
> > I just got around to trying this out of curiosity.  Superficially it did
> > appear to possibly make the issue harder to hit (it took over 30 minutes),
> > but the issue otherwise looks much the same with or without that patch.
> > 
> > Just out of curiosity, next thing on my list is to disable hrtimers entirely
> > and see what happens.
> > 
> > Jonathan  
> > > 
> > >   Thanx, Paul
> > > 
> > 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-01 Thread Jonathan Cameron
On Mon, 31 Jul 2017 12:09:08 +0100
Jonathan Cameron  wrote:

> On Wed, 26 Jul 2017 16:15:05 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:  
> > > From: "Paul E. McKenney" 
> > > Date: Wed, 26 Jul 2017 15:36:58 -0700
> > > 
> > > > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > > > CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> > > > really is having an effect.
> > > 
> > > Thanks for all of the info Paul, I'll digest this and scan over the
> > > code myself.
> > > 
> > > Just out of curiosity, what x86 idle method is your machine using?
> > > The mwait one or the one which simply uses 'halt'?  The mwait variant
> > > might mask this bug, and halt would be a lot closer to how sparc64 and
> > > Jonathan's system operates.
> > 
> > My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> > I am not using the mwait one.  Here is a grep for IDLE in my .config:
> > 
> > CONFIG_NO_HZ_IDLE=y
> > CONFIG_GENERIC_SMP_IDLE_THREAD=y
> > # CONFIG_IDLE_PAGE_TRACKING is not set
> > CONFIG_ACPI_PROCESSOR_IDLE=y
> > CONFIG_CPU_IDLE=y
> > # CONFIG_CPU_IDLE_GOV_LADDER is not set
> > CONFIG_CPU_IDLE_GOV_MENU=y
> > # CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
> > # CONFIG_INTEL_IDLE is not set
> >   
> > > On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> > > local TICK register keeps advancing, and the local timer therefore
> > > will still trigger.  Also, any externally generated interrupts
> > > (including cross calls) will wake up the cpu as well.
> > > 
> > > The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> > > case.  One of my running theories is that we miss scheduling a tick
> > > due to a race.  That would be consistent with the behavior we see
> > > in the RCU dumps, I think.
> > 
> > But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> > warning?  By default, your grace period needs to extend for more than
> > 21 seconds (more than one-third of a -minute-) to get one.  Or do
> > you mean that the ticks get shut off now and forever, as opposed to
> > just losing one of them?
> >   
> > > Anyways, just a theory, and that's why I keep mentioning that commit
> > > about the revert of the revert (specifically
> > > 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> > > 
> > > :-)
> > 
> > I am running an overnight test in preparation for attempting to push
> > some fixes for regressions into 4.12, but will try reverting this
> > and enabling CONFIG_HZ_PERIODIC tomorrow.
> > 
> > Jonathan, might the commit that Dave points out above be what reduces
> > the probability of occurrence as you test older releases?  
> I just got around to trying this out of curiosity.  Superficially it did
> appear to possibly make the issue harder to hit (it took over 30 minutes),
> but the issue otherwise looks much the same with or without that patch.
> 
> Just out of curiosity, next thing on my list is to disable hrtimers entirely
> and see what happens.
> 
> Jonathan
> > 
> > Thanx, Paul
> >   
> 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-31 Thread Jonathan Cameron
On Mon, 31 Jul 2017 08:04:11 -0700
"Paul E. McKenney"  wrote:

> On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> > On Fri, 28 Jul 2017 12:03:50 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:  
> > > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > > "Paul E. McKenney"  wrote:
> > > > 
> > > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:
> > > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > > Jonathan Cameron  wrote:  
> > > > > 
> > > > > [ . . . ]
> > > > > 
> > > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the
> > > > > > interest of having fewer timer events.  Issue became much easier
> > > > > > to trigger (on some runs before I could get tracing up and
> > > > > > running).
> > > > > > So logs are large enough that pastebin doesn't like them - please
> > > > > > shout if another timer period is of interest.
> > > > > > 
> > > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > > 
> > > > > > The relevant timeout on the RCU stall detector was 8 seconds.  
> > > > > > Event is
> > > > > > detected around 835.
> > > > > > 
> > > > > > It's a lot of logs, so I haven't identified a smoking gun yet but 
> > > > > > there
> > > > > > may well be one in there.  
> > > > > 
> > > > > The dmesg says:
> > > > > 
> > > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 
> > > > > RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > > 
> > > > > So I look for "rcu_preempt" timer events and find these:
> > > > > 
> > > > > rcu_preempt-9 [019]    827.579114: timer_init: 
> > > > > timer=8017d5fc7da0
> > > > > rcu_preempt-9 [019] d..1   827.579115: timer_start: 
> > > > > timer=8017d5fc7da0 function=process_timeout 
> > > > > 
> > > > > Next look for "8017d5fc7da0" and I don't find anything else.
> > > > It does show up off the bottom of what would fit in pastebin...
> > > > 
> > > >  rcu_preempt-9 [001] d..1   837.681077: timer_cancel: 
> > > > timer=8017d5fc7da0
> > > >  rcu_preempt-9 [001]    837.681086: timer_init: 
> > > > timer=8017d5fc7da0
> > > >  rcu_preempt-9 [001] d..1   837.681087: timer_start: 
> > > > timer=8017d5fc7da0 function=process_timeout expires=4295101298 
> > > > [timeout=1] cpu=1 idx=0 flags=
> > > 
> > > Odd.  I would expect an expiration...  And ten seconds is way longer
> > > than the requested one jiffy!
> > >   
> > > > > The timeout was one jiffy, and more than a second later, no 
> > > > > expiration.
> > > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > > this in the trace.
> > > > > 
> > > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > > any case).
> > > > > 
> > > > > The last time we saw something like this it was a timer HW/driver 
> > > > > problem,
> > > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > > and SPARC.  ;-)
> > > > Could be different issues, both of which were hidden by that lockup 
> > > > detector.
> > > > 
> > > > There is an errata workaround for the timers on this particular board.
> > > > I'm only vaguely aware of it, so it may be unconnected.
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > > 
> > > > Seems unlikely though!  Plus we've not yet seen it on the other chips
> > > > that the errata affects (not that that means much).
> > > 
> > > If you can reproduce quickly, might be worth trying anyway...
> > > 
> > >   Thanx, Paul  
> > Errata fix is running already and was for all those tests.  
> 
> I was afraid of that...  ;-)
It's a pretty rare errata it seems.  Not actually managed to catch
one yet. 
> 
> > I'll have a dig into the timers today and see where I get to.  
> 
> Look forward to seeing what you find!
Nothing obvious turning up, other than we don't seem to have the issue
when we aren't running hrtimers.

On the plus side, I just got a report that it is affecting our d03
boards, which is good given that I couldn't tell what the difference
could be wrt this issue!

It indeed looks like we are consistently missing a timer before
the rcu splat occurs.

J
> 
>   Thanx, Paul
> 
> > Jonathan  
> > >   
> > > > Jonathan
> > > > 
> > > > > 
> > > > > Thomas, any debugging suggestions?
> > > > > 
> > > > >   Thanx, Paul
> > > > > 
> > > > 
> > >   
> >   
> 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-31 Thread Paul E. McKenney
On Mon, Jul 31, 2017 at 12:08:47PM +0100, Jonathan Cameron wrote:
> On Fri, 28 Jul 2017 12:03:50 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:
> > > On Fri, 28 Jul 2017 09:55:29 -0700
> > > "Paul E. McKenney"  wrote:
> > >   
> > > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:  
> > > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > > Jonathan Cameron  wrote:
> > > > 
> > > > [ . . . ]
> > > >   
> > > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the
> > > > > interest of having fewer timer events.  Issue became much easier to
> > > > > trigger (on some runs before I could get tracing up and running).
> > > > > So logs are large enough that pastebin doesn't like them - please
> > > > > shout if another timer period is of interest.
> > > > > 
> > > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > > 
> > > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event 
> > > > > is
> > > > > detected around 835.
> > > > > 
> > > > > It's a lot of logs, so I haven't identified a smoking gun yet but 
> > > > > there
> > > > > may well be one in there.
> > > > 
> > > > The dmesg says:
> > > > 
> > > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 
> > > > RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > 
> > > > So I look for "rcu_preempt" timer events and find these:
> > > > 
> > > > rcu_preempt-9 [019]    827.579114: timer_init: 
> > > > timer=8017d5fc7da0
> > > > rcu_preempt-9 [019] d..1   827.579115: timer_start: 
> > > > timer=8017d5fc7da0 function=process_timeout 
> > > > 
> > > > Next look for "8017d5fc7da0" and I don't find anything else.  
> > > It does show up off the bottom of what would fit in pastebin...
> > > 
> > >  rcu_preempt-9 [001] d..1   837.681077: timer_cancel: 
> > > timer=8017d5fc7da0
> > >  rcu_preempt-9 [001]    837.681086: timer_init: 
> > > timer=8017d5fc7da0
> > >  rcu_preempt-9 [001] d..1   837.681087: timer_start: 
> > > timer=8017d5fc7da0 function=process_timeout expires=4295101298 
> > > [timeout=1] cpu=1 idx=0 flags=  
> > 
> > Odd.  I would expect an expiration...  And ten seconds is way longer
> > than the requested one jiffy!
> > 
> > > > The timeout was one jiffy, and more than a second later, no expiration.
> > > > Is it possible that this event was lost?  I am not seeing any sign of
> > > > this in the trace.
> > > > 
> > > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > > any case).
> > > > 
> > > > The last time we saw something like this it was a timer HW/driver 
> > > > problem,
> > > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > > and SPARC.  ;-)  
> > > Could be different issues, both of which were hidden by that lockup 
> > > detector.
> > > 
> > > There is an errata workaround for the timers on this particular board.
> > > I'm only vaguely aware of it, so it may be unconnected.
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > > 
> > > Seems unlikely though!  Plus we've not yet seen it on the other chips
> > > that the errata affects (not that that means much).
> > 
> > If you can reproduce quickly, might be worth trying anyway...
> > 
> > Thanx, Paul
> Errata fix is running already and was for all those tests.

I was afraid of that...  ;-)

> I'll have a dig into the timers today and see where I get to.

Look forward to seeing what you find!

Thanx, Paul

> Jonathan
> > 
> > > Jonathan
> > >   
> > > > 
> > > > Thomas, any debugging suggestions?
> > > > 
> > > > Thanx, Paul
> > > >   
> > >   
> > 
> 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-31 Thread Jonathan Cameron
On Wed, 26 Jul 2017 16:15:05 -0700
"Paul E. McKenney"  wrote:

> On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:
> > From: "Paul E. McKenney" 
> > Date: Wed, 26 Jul 2017 15:36:58 -0700
> >   
> > > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > > CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> > > really is having an effect.  
> > 
> > Thanks for all of the info Paul, I'll digest this and scan over the
> > code myself.
> > 
> > Just out of curiosity, what x86 idle method is your machine using?
> > The mwait one or the one which simply uses 'halt'?  The mwait variant
> > might mask this bug, and halt would be a lot closer to how sparc64 and
> > Jonathan's system operates.  
> 
> My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> I am not using the mwait one.  Here is a grep for IDLE in my .config:
> 
>   CONFIG_NO_HZ_IDLE=y
>   CONFIG_GENERIC_SMP_IDLE_THREAD=y
>   # CONFIG_IDLE_PAGE_TRACKING is not set
>   CONFIG_ACPI_PROCESSOR_IDLE=y
>   CONFIG_CPU_IDLE=y
>   # CONFIG_CPU_IDLE_GOV_LADDER is not set
>   CONFIG_CPU_IDLE_GOV_MENU=y
>   # CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
>   # CONFIG_INTEL_IDLE is not set
> 
> > On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> > local TICK register keeps advancing, and the local timer therefore
> > will still trigger.  Also, any externally generated interrupts
> > (including cross calls) will wake up the cpu as well.
> > 
> > The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> > case.  One of my running theories is that we miss scheduling a tick
> > due to a race.  That would be consistent with the behavior we see
> > in the RCU dumps, I think.  
> 
> But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> warning?  By default, your grace period needs to extend for more than
> 21 seconds (more than one-third of a -minute-) to get one.  Or do
> you mean that the ticks get shut off now and forever, as opposed to
> just losing one of them?
> 
> > Anyways, just a theory, and that's why I keep mentioning that commit
> > about the revert of the revert (specifically
> > 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> > 
> > :-)  
> 
> I am running an overnight test in preparation for attempting to push
> some fixes for regressions into 4.12, but will try reverting this
> and enabling CONFIG_HZ_PERIODIC tomorrow.
> 
> Jonathan, might the commit that Dave points out above be what reduces
> the probability of occurrence as you test older releases?
I just got around to trying this out of curiosity.  Superficially it did
appear to possibly make the issue harder to hit (it took over 30 minutes),
but the issue otherwise looks much the same with or without that patch.

Just out of curiosity, next thing on my list is to disable hrtimers entirely
and see what happens.

Jonathan
> 
>   Thanx, Paul
> 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-31 Thread Jonathan Cameron
On Fri, 28 Jul 2017 12:03:50 -0700
"Paul E. McKenney"  wrote:

> On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:
> > On Fri, 28 Jul 2017 09:55:29 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:  
> > > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > > Jonathan Cameron  wrote:
> > > 
> > > [ . . . ]
> > >   
> > > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the
> > > > interest of having fewer timer events.  Issue became much easier to
> > > > trigger (on some runs before I could get tracing up and running).
> > > > So logs are large enough that pastebin doesn't like them - please
> > > > shout if another timer period is of interest.
> > > > 
> > > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > > https://pastebin.com/3w1F7amH for dmesg.  
> > > > 
> > > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > > detected around 835.
> > > > 
> > > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > > may well be one in there.
> > > 
> > > The dmesg says:
> > > 
> > > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 
> > > RCU_GP_WAIT_FQS(3) ->state=0x1
> > > 
> > > So I look for "rcu_preempt" timer events and find these:
> > > 
> > > rcu_preempt-9 [019]    827.579114: timer_init: 
> > > timer=8017d5fc7da0
> > > rcu_preempt-9 [019] d..1   827.579115: timer_start: 
> > > timer=8017d5fc7da0 function=process_timeout 
> > > 
> > > Next look for "8017d5fc7da0" and I don't find anything else.  
> > It does show up off the bottom of what would fit in pastebin...
> > 
> >  rcu_preempt-9 [001] d..1   837.681077: timer_cancel: 
> > timer=8017d5fc7da0
> >  rcu_preempt-9 [001]    837.681086: timer_init: 
> > timer=8017d5fc7da0
> >  rcu_preempt-9 [001] d..1   837.681087: timer_start: 
> > timer=8017d5fc7da0 function=process_timeout expires=4295101298 
> > [timeout=1] cpu=1 idx=0 flags=  
> 
> Odd.  I would expect an expiration...  And ten seconds is way longer
> than the requested one jiffy!
> 
> > > The timeout was one jiffy, and more than a second later, no expiration.
> > > Is it possible that this event was lost?  I am not seeing any sign of
> > > this in the trace.
> > > 
> > > I don't see any sign of CPU hotplug (and I test with lots of that in
> > > any case).
> > > 
> > > The last time we saw something like this it was a timer HW/driver problem,
> > > but it is a bit hard to imagine such a problem affecting both ARM64
> > > and SPARC.  ;-)  
> > Could be different issues, both of which were hidden by that lockup 
> > detector.
> > 
> > There is an errata workaround for the timers on this particular board.
> > I'm only vaguely aware of it, so it may be unconnected.
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb
> > 
> > Seems unlikely though!  Plus we've not yet seen it on the other chips
> > that the errata affects (not that that means much).
> 
> If you can reproduce quickly, might be worth trying anyway...
> 
>   Thanx, Paul
Errata fix is running already and was for all those tests.

I'll have a dig into the timers today and see where I get to.

Jonathan
> 
> > Jonathan
> >   
> > > 
> > > Thomas, any debugging suggestions?
> > > 
> > >   Thanx, Paul
> > >   
> >   
> 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-30 Thread Paul E. McKenney
On Sun, Jul 30, 2017 at 09:37:47PM +0800, Boqun Feng wrote:
> On Fri, Jul 28, 2017 at 12:09:56PM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 28, 2017 at 11:41:29AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> > 
> > [ . . . ]
> > 
> > > > Even though Jonathan's testing indicates that it didn't fix this
> > > > particular problem:
> > > > 
> > > > Acked-by: Paul E. McKenney 
> > > 
> > > And while we are at it:
> > > 
> > > Tested-by: Paul E. McKenney 
> > 
> > Not because it fixed the TREE01 issue -- it did not.  But as near
> > as I can see, it didn't cause any additional issues.
> > 
> 
> Understood.
> 
> Still working on waketorture for a test case that could trigger this
> problem in the real world.  My original plan was to send this out once I
> could use waketorture to show that this patch actually resolves some
> potential bugs, but I am putting it out here in case it may help.
> 
> Will send it out with your Tested-by and Acked-by and continue to work
> on waketorture.

Sounds good!

Given that Jonathan's traces didn't show a timer expiration, the problems
might be in timers.

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-30 Thread Boqun Feng
On Fri, Jul 28, 2017 at 12:09:56PM -0700, Paul E. McKenney wrote:
> On Fri, Jul 28, 2017 at 11:41:29AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> 
> [ . . . ]
> 
> > > Even though Jonathan's testing indicates that it didn't fix this
> > > particular problem:
> > > 
> > > Acked-by: Paul E. McKenney 
> > 
> > And while we are at it:
> > 
> > Tested-by: Paul E. McKenney 
> 
> Not because it fixed the TREE01 issue -- it did not.  But as near
> as I can see, it didn't cause any additional issues.
> 

Understood.

Still working on waketorture for a test case that could trigger this
problem in the real world.  My original plan was to send this out once I
could use waketorture to show that this patch actually resolves some
potential bugs, but I am putting it out here in case it may help.

Will send it out with your Tested-by and Acked-by and continue to work
on waketorture.

Regards,
Boqun

>   Thanx, Paul
> 




Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread Boqun Feng
On Fri, Jul 28, 2017 at 11:41:29AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> > > Hi Jonathan,
> > > 
> > > FWIW, there is a wakeup-missing issue in swake_up() and swake_up_all():
> > > 
> > >   https://marc.info/?l=linux-kernel&m=149750022019663
> > > 
> > > and RCU began using swait/wake last year, so I thought this could be
> > > relevant.
> > > 
> > > Could you try the following patch and see if it works? Thanks.
> > > 
> > > Regards,
> > > Boqun
> > > 
> > > -->8
> > > Subject: [PATCH] swait: Remove the lockless swait_active() check in
> > >  swake_up*()
> > > 
> > > Steven Rostedt reported a potential race in RCU core because of
> > > swake_up():
> > > 
> > > CPU0                                CPU1
> > > 
> > > __call_rcu_core() {
> > > 
> > >  spin_lock(rnp_root)
> > >  need_wake = __rcu_start_gp() {
> > >   rcu_start_gp_advanced() {
> > >gp_flags = FLAG_INIT
> > >   }
> > >  }
> > > 
> > >  rcu_gp_kthread() {
> > >swait_event_interruptible(wq,
> > > gp_flags & FLAG_INIT) {
> > 
> > So the idea is that we get the old value of ->gp_flags here, correct?
> > 

Yes.

> > >spin_lock(q->lock)
> > > 
> > > *fetch wq->task_list here! *
> > 
> > And the above fetch is really part of the swait_active() called out
> > below, right?
> > 

Right.

> > >list_add(wq->task_list, q->task_list)
> > >spin_unlock(q->lock);
> > > 
> > >*fetch old value of gp_flags here *
> > 
> > And here we fetch the old value of ->gp_flags again, this time under
> > the lock, right?
> > 

Hmm... a bit different: this fetch is still lockless, but it happens
*after* the waiter has enqueued itself.  We could normally rely on the
spin_lock(q->lock) above to pair with a spin_unlock() in another
critical section that accesses the wait queue (typically in some waker).
But in the case Steven came up with, the waker accesses the wait queue
locklessly, so no such pairing exists.  The waker therefore sees an
empty wait queue and does nothing, while the waiter still observes the
old condition value after its enqueue and goes to sleep.

> > >  spin_unlock(rnp_root)
> > > 
> > >  rcu_gp_kthread_wake() {
> > >   swake_up(wq) {
> > >swait_active(wq) {
> > > list_empty(wq->task_list)
> > > 
> > >} * return false *
> > > 
> > >   if (condition) * false *
> > > schedule();
> > > 
> > > In this case, a wakeup is missed, which could cause the rcu_gp_kthread
> > > to wait for a long time.
> > > 
> > > The reason for this is that we do a lockless swait_active() check in
> > > swake_up(). To fix this, we can either 1) add an smp_mb() in swake_up()
> > > before swait_active() to provide the proper ordering, or 2) simply
> > > remove the swait_active() check from swake_up().
> > > 
> > > Solution 2 not only fixes this problem but also keeps the swait and
> > > wait APIs as close as possible, as wake_up() doesn't provide a full
> > > barrier and doesn't do a lockless check of the wait queue either.
> > > Moreover, there are already users calling swait_active() to do their
> > > own quick checks of the wait queues, so it makes less sense for
> > > swake_up() and swake_up_all() to do this on their own.
> > > 
> > > This patch then removes the lockless swait_active() check in swake_up()
> > > and swake_up_all().
> > > 
> > > Reported-by: Steven Rostedt 
> > > Signed-off-by: Boqun Feng 
> > 
> > Even though Jonathan's testing indicates that it didn't fix this
> > particular problem:
> > 
> > Acked-by: Paul E. McKenney 
> 
> And while we are at it:
> 
> Tested-by: Paul E. McKenney 
> 

Thanks!

Regards,
Boqun

> > > ---
> > >  kernel/sched/swait.c | 6 --
> > >  1 file changed, 6 deletions(-)
> > > 
> > > diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> > > index 3d5610dcce11..2227e183e202 100644
> > > --- a/kernel/sched/swait.c
> > > +++ b/kernel/sched/swait.c
> > > @@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
> > >  {
> > >   unsigned long flags;
> > > 
> > > - if (!swait_active(q))
> > > - return;
> > > -
> > >   raw_spin_lock_irqsave(&q->lock, flags);
> > >   swake_up_locked(q);
> > >   raw_spin_unlock_irqrestore(&q->lock, flags);
> > > @@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
> > >   struct swait_queue *curr;
> > >   LIST_HEAD(tmp);
> > > 
> > > - if (!swait_active(q))
> > > 
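
For reference, a sketch of the alternative fix -- option 1 in the
changelog above, which keeps the lockless fast path but adds the missing
barrier.  This is illustrative only, not the patch that was applied:

void swake_up(struct swait_queue_head *q)
{
	unsigned long flags;

	/* Order this waker's prior condition store against the
	 * swait_active() load below; pairs with the waiter's enqueue
	 * under q->lock. */
	smp_mb();
	if (!swait_active(q))
		return;

	raw_spin_lock_irqsave(&q->lock, flags);
	swake_up_locked(q);
	raw_spin_unlock_irqrestore(&q->lock, flags);
}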

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread Paul E. McKenney
On Fri, Jul 28, 2017 at 11:41:29AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:

[ . . . ]

> > Even though Jonathan's testing indicates that it didn't fix this
> > particular problem:
> > 
> > Acked-by: Paul E. McKenney 
> 
> And while we are at it:
> 
> Tested-by: Paul E. McKenney 

Not because it fixed the TREE01 issue -- it did not.  But as near
as I can see, it didn't cause any additional issues.

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread Paul E. McKenney
On Fri, Jul 28, 2017 at 06:27:05PM +0100, Jonathan Cameron wrote:
> On Fri, 28 Jul 2017 09:55:29 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:
> > > On Fri, 28 Jul 2017 08:44:11 +0100
> > > Jonathan Cameron  wrote:  
> > 
> > [ . . . ]
> > 
> > > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest
> > > of having fewer timer events.  Issue became much easier to trigger (on
> > > some runs before I could get tracing up and running).
> > > So logs are large enough that pastebin doesn't like them - please shout
> > > if another timer period is of interest.
> > > 
> > > https://pastebin.com/iUZDfQGM for the timer trace.
> > > https://pastebin.com/3w1F7amH for dmesg.  
> > > 
> > > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > > detected around 835.
> > > 
> > > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > > may well be one in there.  
> > 
> > The dmesg says:
> > 
> > rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 
> > RCU_GP_WAIT_FQS(3) ->state=0x1
> > 
> > So I look for "rcu_preempt" timer events and find these:
> > 
> > rcu_preempt-9 [019]    827.579114: timer_init: 
> > timer=8017d5fc7da0
> > rcu_preempt-9 [019] d..1   827.579115: timer_start: 
> > timer=8017d5fc7da0 function=process_timeout 
> > 
> > Next look for "8017d5fc7da0" and I don't find anything else.
> It does show up off the bottom of what would fit in pastebin...
> 
>  rcu_preempt-9 [001] d..1   837.681077: timer_cancel: 
> timer=8017d5fc7da0
>  rcu_preempt-9 [001]    837.681086: timer_init: 
> timer=8017d5fc7da0
>  rcu_preempt-9 [001] d..1   837.681087: timer_start: 
> timer=8017d5fc7da0 function=process_timeout expires=4295101298 
> [timeout=1] cpu=1 idx=0 flags=

Odd.  I would expect an expiration...  And ten seconds is way longer
than the requested one jiffy!

> > The timeout was one jiffy, and more than a second later, no expiration.
> > Is it possible that this event was lost?  I am not seeing any sign of
> > this in the trace.
> > 
> > I don't see any sign of CPU hotplug (and I test with lots of that in
> > any case).
> > 
> > The last time we saw something like this it was a timer HW/driver problem,
> > but it is a bit hard to imagine such a problem affecting both ARM64
> > and SPARC.  ;-)
> Could be different issues, both of which were hidden by that lockup detector.
> 
> There is an errata work around for the timers on this particular board.
> I'm only vaguely aware of it, so may be unconnected.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2=bb42ca47401010fc02901b5e8f79e40a26f208cb
> 
> Seems unlikely though! Plus we've not yet seen it on the other chips that
> errata affects (not that that means much).

If you can reproduce quickly, might be worth trying anyway...

Thanx, Paul

> Jonathan
> 
> > 
> > Thomas, any debugging suggestions?
> > 
> > Thanx, Paul
> > 
> 
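
For context on the trace events discussed above: schedule_timeout()
arms a one-shot kernel timer whose callback simply wakes the sleeping
task, and that callback is the "function=process_timeout" recorded by
the timer_start event.  A rough, userspace-compilable sketch of the
mechanism (simplified from kernel/time/timer.c; the sketch_* names and
the printf are illustrative stand-ins, not the kernel's code):

#include <stdio.h>

/* Illustrative stand-ins for the kernel's types. */
struct task_struct { const char *comm; };

struct timer_list {
	unsigned long expires;			/* in jiffies */
	void (*function)(unsigned long);
	unsigned long data;
};

/* The timer callback: just wake whoever queued the timer. */
static void sketch_process_timeout(unsigned long data)
{
	struct task_struct *t = (struct task_struct *)data;

	printf("timer fired: waking %s\n", t->comm);	/* kernel: wake_up_process(t) */
}

/* Sketch of schedule_timeout(): queue the timer, go to sleep, and rely
 * on the timer interrupt to run process_timeout().  If that timer is
 * lost -- the suspicion in this thread -- nothing wakes the task, which
 * then sits in state S exactly as in the stall backtraces. */
static void sketch_schedule_timeout(struct task_struct *cur,
				    unsigned long jiffies_now,
				    unsigned long timeout)
{
	struct timer_list timer = {
		.expires  = jiffies_now + timeout,	/* "timeout=1" above */
		.function = sketch_process_timeout,
		.data     = (unsigned long)cur,
	};

	timer.function(timer.data);	/* stand-in for the timer interrupt firing */
}

int main(void)
{
	struct task_struct rcu_preempt = { .comm = "rcu_preempt" };

	sketch_schedule_timeout(&rcu_preempt, 4295101297UL, 1);
	return 0;
}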



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread David Miller
From: Boqun Feng 
Date: Fri, 28 Jul 2017 20:54:16 +0800

> Hi Jonathan,
> 
> FWIW, there is a wakeup-missing issue in swake_up() and swake_up_all():
> 
>   https://marc.info/?l=linux-kernel&m=149750022019663
> 
> and RCU began to use swait/wake last year, so I thought this could be
> relevant.
> 
> Could you try the following patch and see if it works? Thanks.

Just FYI I'm testing this patch now on sparc64...


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread Paul E. McKenney
On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> > Hi Jonathan,
> > 
> > FWIW, there is a wakeup-missing issue in swake_up() and swake_up_all():
> > 
> > https://marc.info/?l=linux-kernel&m=149750022019663
> > 
> > and RCU began to use swait/wake last year, so I thought this could be
> > relevant.
> > 
> > Could you try the following patch and see if it works? Thanks.
> > 
> > Regards,
> > Boqun
> > 
> > -->8
> > Subject: [PATCH] swait: Remove the lockless swait_active() check in
> >  swake_up*()
> > 
> > Steven Rostedt reported a potential race in RCU core because of
> > swake_up():
> > 
> > CPU0                                CPU1
> > 
> > __call_rcu_core() {
> > 
> >  spin_lock(rnp_root)
> >  need_wake = __rcu_start_gp() {
> >   rcu_start_gp_advanced() {
> >gp_flags = FLAG_INIT
> >   }
> >  }
> > 
> >  rcu_gp_kthread() {
> >swait_event_interruptible(wq,
> > gp_flags & FLAG_INIT) {
> 
> So the idea is that we get the old value of ->gp_flags here, correct?
> 
> >spin_lock(q->lock)
> > 
> > *fetch wq->task_list here! *
> 
> And the above fetch is really part of the swait_active() called out
> below, right?
> 
> >list_add(wq->task_list, q->task_list)
> >spin_unlock(q->lock);
> > 
> >*fetch old value of gp_flags here *
> 
> And here we fetch the old value of ->gp_flags again, this time under
> the lock, right?
> 
> >  spin_unlock(rnp_root)
> > 
> >  rcu_gp_kthread_wake() {
> >   swake_up(wq) {
> >swait_active(wq) {
> > list_empty(wq->task_list)
> > 
> >} * return false *
> > 
> >   if (condition) * false *
> > schedule();
> > 
> > In this case, a wakeup is missed, which could cause the rcu_gp_kthread
> > to wait for a long time.
> > 
> > The reason for this is that we do a lockless swait_active() check in
> > swake_up(). To fix this, we can either 1) add an smp_mb() in swake_up()
> > before swait_active() to provide the proper ordering or 2) simply remove
> > the swait_active() check in swake_up().
> > 
> > Solution 2 not only fixes this problem but also keeps the swait and
> > wait APIs as close as possible, as wake_up() doesn't provide a full
> > barrier and doesn't do a lockless check of the wait queue either.
> > Moreover, there are users already using swait_active() to do their quick
> > checks for the wait queues, so it makes less sense for swake_up() and
> > swake_up_all() to do this on their own.
> > 
> > This patch then removes the lockless swait_active() check in swake_up()
> > and swake_up_all().
> > 
> > Reported-by: Steven Rostedt 
> > Signed-off-by: Boqun Feng 
> 
> Even though Jonathan's testing indicates that it didn't fix this
> particular problem:
> 
> Acked-by: Paul E. McKenney 

And while we are at it:

Tested-by: Paul E. McKenney 

> > ---
> >  kernel/sched/swait.c | 6 ------
> >  1 file changed, 6 deletions(-)
> > 
> > diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> > index 3d5610dcce11..2227e183e202 100644
> > --- a/kernel/sched/swait.c
> > +++ b/kernel/sched/swait.c
> > @@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
> >  {
> > unsigned long flags;
> > 
> > -   if (!swait_active(q))
> > -   return;
> > -
> > raw_spin_lock_irqsave(&q->lock, flags);
> > swake_up_locked(q);
> > raw_spin_unlock_irqrestore(&q->lock, flags);
> > @@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
> > struct swait_queue *curr;
> > LIST_HEAD(tmp);
> > 
> > -   if (!swait_active(q))
> > -   return;
> > -
> > raw_spin_lock_irq(&q->lock);
> > list_splice_init(&q->task_list, &tmp);
> > while (!list_empty(&tmp)) {
> > -- 
> > 2.13.0
> > 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread Jonathan Cameron
On Fri, 28 Jul 2017 09:55:29 -0700
"Paul E. McKenney"  wrote:

> On Fri, Jul 28, 2017 at 02:24:03PM +0100, Jonathan Cameron wrote:
> > On Fri, 28 Jul 2017 08:44:11 +0100
> > Jonathan Cameron  wrote:  
> 
> [ . . . ]
> 
> > Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest
> > of having fewer timer events.  Issue became much easier to trigger (on
> > some runs before I could get tracing up and running).
> > 
> > So logs are large enough that pastebin doesn't like them - please shout
> > if another timer period is of interest.
> > 
> > https://pastebin.com/iUZDfQGM for the timer trace.
> > https://pastebin.com/3w1F7amH for dmesg.  
> > 
> > The relevant timeout on the RCU stall detector was 8 seconds.  Event is
> > detected around 835.
> > 
> > It's a lot of logs, so I haven't identified a smoking gun yet but there
> > may well be one in there.  
> 
> The dmesg says:
> 
> rcu_preempt kthread starved for 2508 jiffies! g112 c111 f0x0 
> RCU_GP_WAIT_FQS(3) ->state=0x1
> 
> So I look for "rcu_preempt" timer events and find these:
> 
> rcu_preempt-9 [019]    827.579114: timer_init: timer=8017d5fc7da0
> rcu_preempt-9 [019] d..1   827.579115: timer_start: 
> timer=8017d5fc7da0 function=process_timeout 
> 
> Next look for "8017d5fc7da0" and I don't find anything else.
It does show up off the bottom of what would fit in pastebin...

 rcu_preempt-9 [001] d..1   837.681077: timer_cancel: 
timer=8017d5fc7da0
 rcu_preempt-9 [001]    837.681086: timer_init: 
timer=8017d5fc7da0
 rcu_preempt-9 [001] d..1   837.681087: timer_start: 
timer=8017d5fc7da0 function=process_timeout expires=4295101298 [timeout=1] 
cpu=1 idx=0 flags=

> The timeout was one jiffy, and more than a second later, no expiration.
> Is it possible that this event was lost?  I am not seeing any sign of
> this in the trace.
> 
> I don't see any sign of CPU hotplug (and I test with lots of that in
> any case).
> 
> The last time we saw something like this it was a timer HW/driver problem,
> but it is a bit hard to imagine such a problem affecting both ARM64
> and SPARC.  ;-)
Could be different issues, both of which were hidden by that lockup detector.

There is an errata work around for the timers on this particular board.
I'm only vaguely aware of it, so may be unconnected.

> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/clocksource/arm_arch_timer.c?h=v4.13-rc2&id=bb42ca47401010fc02901b5e8f79e40a26f208cb

> Seems unlikely though! Plus we've not yet seen it on the other chips that
> errata affects (not that that means much).

Jonathan

> 
> Thomas, any debugging suggestions?
> 
>   Thanx, Paul
> 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread Paul E. McKenney
On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> Hi Jonathan,
> 
> FWIW, there is a wakeup-missing issue in swake_up() and swake_up_all():
> 
>   https://marc.info/?l=linux-kernel&m=149750022019663
> 
> and RCU began to use swait/wake last year, so I thought this could be
> relevant.
> 
> Could you try the following patch and see if it works? Thanks.
> 
> Regards,
> Boqun
> 
> -->8
> Subject: [PATCH] swait: Remove the lockless swait_active() check in
>  swake_up*()
> 
> Steven Rostedt reported a potential race in RCU core because of
> swake_up():
> 
> CPU0                                CPU1
> 
> __call_rcu_core() {
> 
>  spin_lock(rnp_root)
>  need_wake = __rcu_start_gp() {
>   rcu_start_gp_advanced() {
>gp_flags = FLAG_INIT
>   }
>  }
> 
>  rcu_gp_kthread() {
>swait_event_interruptible(wq,
> gp_flags & FLAG_INIT) {

So the idea is that we get the old value of ->gp_flags here, correct?

>spin_lock(q->lock)
> 
> *fetch wq->task_list here! *

And the above fetch is really part of the swait_active() called out
below, right?

>list_add(wq->task_list, q->task_list)
>spin_unlock(q->lock);
> 
>*fetch old value of gp_flags here *

And here we fetch the old value of ->gp_flags again, this time under
the lock, right?

>  spin_unlock(rnp_root)
> 
>  rcu_gp_kthread_wake() {
>   swake_up(wq) {
>swait_active(wq) {
> list_empty(wq->task_list)
> 
>} * return false *
> 
>   if (condition) * false *
> schedule();
> 
> In this case, a wakeup is missed, which could cause the rcu_gp_kthread
> to wait for a long time.
> 
> The reason for this is that we do a lockless swait_active() check in
> swake_up(). To fix this, we can either 1) add an smp_mb() in swake_up()
> before swait_active() to provide the proper ordering or 2) simply remove
> the swait_active() check in swake_up().
> 
> Solution 2 not only fixes this problem but also keeps the swait and
> wait APIs as close as possible, as wake_up() doesn't provide a full
> barrier and doesn't do a lockless check of the wait queue either.
> Moreover, there are users already using swait_active() to do their quick
> checks for the wait queues, so it makes less sense for swake_up() and
> swake_up_all() to do this on their own.
> 
> This patch then removes the lockless swait_active() check in swake_up()
> and swake_up_all().
> 
> Reported-by: Steven Rostedt 
> Signed-off-by: Boqun Feng 

Even though Jonathan's testing indicates that it didn't fix this
particular problem:

Acked-by: Paul E. McKenney 

> ---
>  kernel/sched/swait.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> index 3d5610dcce11..2227e183e202 100644
> --- a/kernel/sched/swait.c
> +++ b/kernel/sched/swait.c
> @@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
>  {
>   unsigned long flags;
> 
> - if (!swait_active(q))
> - return;
> -
>   raw_spin_lock_irqsave(&q->lock, flags);
>   swake_up_locked(q);
>   raw_spin_unlock_irqrestore(&q->lock, flags);
> @@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
>   struct swait_queue *curr;
>   LIST_HEAD(tmp);
> 
> - if (!swait_active(q))
> - return;
> -
>   raw_spin_lock_irq(&q->lock);
>   list_splice_init(&q->task_list, &tmp);
>   while (!list_empty(&tmp)) {
> -- 
> 2.13.0
> 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread Jonathan Cameron
On Fri, 28 Jul 2017 08:44:11 +0100
Jonathan Cameron  wrote:

> On Thu, 27 Jul 2017 09:52:45 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:  
> > > On Thu, 27 Jul 2017 14:49:03 +0100
> > > Jonathan Cameron  wrote:
> > > 
> > > > On Thu, 27 Jul 2017 05:49:13 -0700
> > > > "Paul E. McKenney"  wrote:
> > > > 
> > > > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:  
> > > > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > > > "Paul E. McKenney"  wrote:
> > > > > > 
> > > > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote: 
> > > > > > >
> > > > > > 
> > > > > > > > Indeed, that really wouldn't explain how we end up with a RCU 
> > > > > > > > stall
> > > > > > > > dump listing almost all of the cpus as having missed a grace 
> > > > > > > > period.  
> > > > > > > 
> > > > > > > I have seen stranger things, but admittedly not often.
> > > > > > 
> > > > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > > > 
> > > > > > Are you sure that its timeout has expired and it's not being 
> > > > > > scheduled,
> > > > > > or could it be a bad (large) timeout (looks unlikely) or that it's 
> > > > > > being
> > > > > > scheduled but not correctly noting gps on other CPUs?
> > > > > > 
> > > > > > It's not in R state, so if it's not being scheduled at all, then 
> > > > > > it's
> > > > > > because the timer has not fired:
> > > > > 
> > > > > Good point, Nick!
> > > > > 
> > > > > Jonathan, could you please reproduce collecting timer event tracing?  
> > > > > 
> > > > I'm a little new to tracing (only started playing with it last week)
> > > > so fingers crossed I've set it up right.  No splats yet.  Was getting
> > > > splats on reading out the trace when running with the RCU stall timer
> > > > set to 4 so have increased that back to the default and am rerunning.
> > > > 
> > > > This may take a while.  Correct me if I've gotten this wrong to save 
> > > > time
> > > > 
> > > > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > > > 
> > > > when it dumps, just send you the relevant part of what is in
> > > > /sys/kernel/debug/tracing/trace?
> > > 
> > > Interestingly the only thing that can make it trip for me with tracing on
> > > is peeking in the tracing buffers.  Not sure this is a valid case or
> > > not.
> > > 
> > > Anyhow all timer activity seems to stop around the area of interest.
> > > 
> > > 

Firstly, sorry to those who got the rather silly-length email a minute ago.
It bounced on the list (fair enough - I was just being lazy on getting
data past our firewalls).

Ok.  Some info.  I disabled a few drivers (usb and SAS) in the interest of having
fewer timer events.  Issue became much easier to trigger (on some runs before
I could get tracing up and running).

So logs are large enough that pastebin doesn't like them - please shout if
another timer period is of interest.

https://pastebin.com/iUZDfQGM for the timer trace.
https://pastebin.com/3w1F7amH for dmesg.  

The relevant timeout on the RCU stall detector was 8 seconds.  Event is
detected around 835.

It's a lot of logs, so I haven't identified a smoking gun yet but there
may well be one in there.

Jonathan


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread Jonathan Cameron
On Fri, 28 Jul 2017 20:54:16 +0800
Boqun Feng  wrote:

> Hi Jonathan,
> 
> FWIW, there is a wakeup-missing issue in swake_up() and swake_up_all():
> 
>   https://marc.info/?l=linux-kernel&m=149750022019663
> 
> and RCU began to use swait/wake last year, so I thought this could be
> relevant.
> 
> Could you try the following patch and see if it works? Thanks.
Sadly seems to be a no...  Just splatted before I could even get
the tracing set up. Back to staring at logs and hoping something
will stand out!

Jonathan
> 
> Regards,
> Boqun
> 
> -->8  
> Subject: [PATCH] swait: Remove the lockless swait_active() check in
>  swake_up*()
> 
> Steven Rostedt reported a potential race in RCU core because of
> swake_up():
> 
> CPU0                                CPU1
> 
> __call_rcu_core() {
> 
>  spin_lock(rnp_root)
>  need_wake = __rcu_start_gp() {
>   rcu_start_gp_advanced() {
>gp_flags = FLAG_INIT
>   }
>  }
> 
>  rcu_gp_kthread() {
>swait_event_interruptible(wq,
> gp_flags & FLAG_INIT) {
>spin_lock(q->lock)
> 
> *fetch wq->task_list here! *
> 
>list_add(wq->task_list, q->task_list)
>spin_unlock(q->lock);
> 
>*fetch old value of gp_flags here *
> 
>  spin_unlock(rnp_root)
> 
>  rcu_gp_kthread_wake() {
>   swake_up(wq) {
>swait_active(wq) {
> list_empty(wq->task_list)
> 
>} * return false *
> 
>   if (condition) * false *
> schedule();
> 
> In this case, a wakeup is missed, which could cause the rcu_gp_kthread
> to wait for a long time.
> 
> The reason for this is that we do a lockless swait_active() check in
> swake_up(). To fix this, we can either 1) add an smp_mb() in swake_up()
> before swait_active() to provide the proper ordering or 2) simply remove
> the swait_active() check in swake_up().
> 
> Solution 2 not only fixes this problem but also keeps the swait and
> wait APIs as close as possible, as wake_up() doesn't provide a full
> barrier and doesn't do a lockless check of the wait queue either.
> Moreover, there are users already using swait_active() to do their quick
> checks for the wait queues, so it makes less sense for swake_up() and
> swake_up_all() to do this on their own.
> 
> This patch then removes the lockless swait_active() check in swake_up()
> and swake_up_all().
> 
> Reported-by: Steven Rostedt 
> Signed-off-by: Boqun Feng 
> ---
>  kernel/sched/swait.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> index 3d5610dcce11..2227e183e202 100644
> --- a/kernel/sched/swait.c
> +++ b/kernel/sched/swait.c
> @@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
>  {
>   unsigned long flags;
>  
> - if (!swait_active(q))
> - return;
> -
>   raw_spin_lock_irqsave(&q->lock, flags);
>   swake_up_locked(q);
>   raw_spin_unlock_irqrestore(&q->lock, flags);
> @@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
>   struct swait_queue *curr;
>   LIST_HEAD(tmp);
>  
> - if (!swait_active(q))
> - return;
> -
>   raw_spin_lock_irq(&q->lock);
>   list_splice_init(&q->task_list, &tmp);
>   while (!list_empty(&tmp)) {



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread Boqun Feng
Hi Jonathan,

FWIW, there is a wakeup-missing issue in swake_up() and swake_up_all():

	https://marc.info/?l=linux-kernel&m=149750022019663

and RCU began to use swait/wake last year, so I thought this could be
relevant.

Could you try the following patch and see if it works? Thanks.

Regards,
Boqun

-->8
Subject: [PATCH] swait: Remove the lockless swait_active() check in
 swake_up*()

Steven Rostedt reported a potential race in RCU core because of
swake_up():

CPU0                                CPU1

__call_rcu_core() {

 spin_lock(rnp_root)
 need_wake = __rcu_start_gp() {
  rcu_start_gp_advanced() {
   gp_flags = FLAG_INIT
  }
 }

 rcu_gp_kthread() {
   swait_event_interruptible(wq,
gp_flags & FLAG_INIT) {
   spin_lock(q->lock)

*fetch wq->task_list here! *

   list_add(wq->task_list, q->task_list)
   spin_unlock(q->lock);

   *fetch old value of gp_flags here *

 spin_unlock(rnp_root)

 rcu_gp_kthread_wake() {
  swake_up(wq) {
   swait_active(wq) {
list_empty(wq->task_list)

   } * return false *

  if (condition) * false *
schedule();

In this case, a wakeup is missed, which could cause the rcu_gp_kthread
to wait for a long time.

The reason for this is that we do a lockless swait_active() check in
swake_up(). To fix this, we can either 1) add an smp_mb() in swake_up()
before swait_active() to provide the proper ordering or 2) simply remove
the swait_active() check in swake_up().

Solution 2 not only fixes this problem but also keeps the swait and
wait APIs as close as possible, as wake_up() doesn't provide a full
barrier and doesn't do a lockless check of the wait queue either.
Moreover, there are users already using swait_active() to do their quick
checks for the wait queues, so it makes less sense for swake_up() and
swake_up_all() to do this on their own.

This patch then removes the lockless swait_active() check in swake_up()
and swake_up_all().

Reported-by: Steven Rostedt 
Signed-off-by: Boqun Feng 
---
 kernel/sched/swait.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
index 3d5610dcce11..2227e183e202 100644
--- a/kernel/sched/swait.c
+++ b/kernel/sched/swait.c
@@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
 {
unsigned long flags;
 
-   if (!swait_active(q))
-   return;
-
	raw_spin_lock_irqsave(&q->lock, flags);
	swake_up_locked(q);
	raw_spin_unlock_irqrestore(&q->lock, flags);
@@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
struct swait_queue *curr;
LIST_HEAD(tmp);
 
-   if (!swait_active(q))
-   return;
-
	raw_spin_lock_irq(&q->lock);
	list_splice_init(&q->task_list, &tmp);
	while (!list_empty(&tmp)) {
-- 
2.13.0
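
For completeness: option 1 in the commit message above (keep the
lockless check but add a full barrier in front of it) is not what this
patch does, but sketching it clarifies what the removed fast path was
doing.  A hypothetical kernel-style sketch of that rejected variant --
swait_active() itself is the real one-liner from include/linux/swait.h:

static inline int swait_active(struct swait_queue_head *q)
{
	return !list_empty(&q->task_list);
}

/* Hypothetical option-1 swake_up(), NOT the merged fix: a full barrier
 * orders the waker's earlier condition store (e.g. to ->gp_flags)
 * before the lockless load of q->task_list, so a waiter that enqueued
 * itself and then re-checked the condition cannot be missed. */
void swake_up(struct swait_queue_head *q)
{
	unsigned long flags;

	smp_mb();	/* pairs with the waiter's enqueue-then-recheck */
	if (!swait_active(q))
		return;

	raw_spin_lock_irqsave(&q->lock, flags);
	swake_up_locked(q);
	raw_spin_unlock_irqrestore(&q->lock, flags);
}

The patch instead drops the check entirely, so the waker always takes
q->lock and the lock's ordering does the same job.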



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-28 Thread Jonathan Cameron
On Thu, 27 Jul 2017 09:52:45 -0700
"Paul E. McKenney"  wrote:

> On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:
> > On Thu, 27 Jul 2017 14:49:03 +0100
> > Jonathan Cameron  wrote:
> >   
> > > On Thu, 27 Jul 2017 05:49:13 -0700
> > > "Paul E. McKenney"  wrote:
> > >   
> > > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:
> > > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > > "Paul E. McKenney"  wrote:
> > > > >   
> > > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:  
> > > > >   
> > > > > > > Indeed, that really wouldn't explain how we end up with a RCU 
> > > > > > > stall
> > > > > > > dump listing almost all of the cpus as having missed a grace 
> > > > > > > period.
> > > > > > 
> > > > > > I have seen stranger things, but admittedly not often.  
> > > > > 
> > > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > > 
> > > > > Are you sure that its timeout has expired and it's not being 
> > > > > scheduled,
> > > > > or could it be a bad (large) timeout (looks unlikely) or that it's 
> > > > > being
> > > > > scheduled but not correctly noting gps on other CPUs?
> > > > > 
> > > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > > because the timer has not fired:  
> > > > 
> > > > Good point, Nick!
> > > > 
> > > > Jonathan, could you please reproduce collecting timer event tracing?
> > > I'm a little new to tracing (only started playing with it last week)
> > > so fingers crossed I've set it up right.  No splats yet.  Was getting
> > > splats on reading out the trace when running with the RCU stall timer
> > > set to 4 so have increased that back to the default and am rerunning.
> > > 
> > > This may take a while.  Correct me if I've gotten this wrong to save time
> > > 
> > > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > > 
> > > when it dumps, just send you the relevant part of what is in
> > > /sys/kernel/debug/tracing/trace?  
> > 
> > Interestingly the only thing that can make it trip for me with tracing on
> > is peeking in the tracing buffers.  Not sure this is a valid case or
> > not.
> > 
> > Anyhow all timer activity seems to stop around the area of interest.
> > 
> > 
> > [ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [ 9442.419107]  1-...: (1 GPs behind) idle=844/0/0 softirq=27747/27755 
> > fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
> > [ 9442.430224]  3-...: (2 GPs behind) idle=8f8/0/0 softirq=32197/32198 
> > fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
> > [ 9442.441340]  4-...: (7 GPs behind) idle=740/0/0 softirq=22351/22352 
> > fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
> > [ 9442.452456]  5-...: (2 GPs behind) idle=9b0/0/0 softirq=21315/21319 
> > fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> > [ 9442.463572]  6-...: (2 GPs behind) idle=794/0/0 softirq=19699/19707 
> > fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
> > [ 9442.474688]  7-...: (2 GPs behind) idle=ac4/0/0 softirq=22547/22554 
> > fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> > [ 9442.485803]  8-...: (9 GPs behind) idle=118/0/0 softirq=281/291 
> > fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
> > [ 9442.496571]  9-...: (9 GPs behind) idle=8fc/0/0 softirq=284/292 
> > fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
> > [ 9442.507339]  10-...: (14 GPs behind) idle=f78/0/0 softirq=254/254 
> > fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
> > [ 9442.518281]  11-...: (9 GPs behind) idle=c9c/0/0 softirq=301/308 
> > fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
> > [ 9442.529136]  12-...: (9 GPs behind) idle=4a4/0/0 softirq=735/737 
> > fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
> > [ 9442.539992]  13-...: (9 GPs behind) idle=34c/0/0 softirq=1121/1131 
> > fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
> > [ 9442.551020]  14-...: (9 GPs behind) idle=2f4/0/0 softirq=707/713 
> > fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
> > [ 9442.561875]  15-...: (2 GPs behind) idle=b30/0/0 softirq=821/976 
> > fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
> > [ 9442.572730]  17-...: (2 GPs behind) idle=5a8/0/0 softirq=1456/1565 
> > fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
> > [ 9442.583759]  18-...: (2 GPs behind) idle=2e4/0/0 softirq=1923/1936 
> > fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
> > [ 9442.594787]  19-...: (2 GPs behind) idle=138/0/0 softirq=1421/1432 
> > fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
> > [ 9442.605816]  20-...: (50 GPs behind) idle=634/0/0 softirq=217/219 
> > fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
> > [ 9442.616758]  21-...: (2 GPs 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-27 Thread Paul E. McKenney
On Thu, Jul 27, 2017 at 05:39:23PM +0100, Jonathan Cameron wrote:
> On Thu, 27 Jul 2017 14:49:03 +0100
> Jonathan Cameron  wrote:
> 
> > On Thu, 27 Jul 2017 05:49:13 -0700
> > "Paul E. McKenney"  wrote:
> > 
> > > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:  
> > > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > > "Paul E. McKenney"  wrote:
> > > > 
> > > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:
> > > > 
> > > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > > dump listing almost all of the cpus as having missed a grace 
> > > > > > period.  
> > > > > 
> > > > > I have seen stranger things, but admittedly not often.
> > > > 
> > > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > > 
> > > > Are you sure that its timeout has expired and it's not being scheduled,
> > > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > > scheduled but not correctly noting gps on other CPUs?
> > > > 
> > > > It's not in R state, so if it's not being scheduled at all, then it's
> > > > because the timer has not fired:
> > > 
> > > Good point, Nick!
> > > 
> > > Jonathan, could you please reproduce collecting timer event tracing?  
> > I'm a little new to tracing (only started playing with it last week)
> > so fingers crossed I've set it up right.  No splats yet.  Was getting
> > splats on reading out the trace when running with the RCU stall timer
> > set to 4 so have increased that back to the default and am rerunning.
> > 
> > This may take a while.  Correct me if I've gotten this wrong to save time
> > 
> > echo "timer:*" > /sys/kernel/debug/tracing/set_event
> > 
> > when it dumps, just send you the relevant part of what is in
> > /sys/kernel/debug/tracing/trace?
> 
> Interestingly the only thing that can make it trip for me with tracing on
> is peeking in the tracing buffers.  Not sure this is a valid case or
> not.
> 
> Anyhow all timer activity seems to stop around the area of interest.
> 
> 
> [ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 9442.419107]1-...: (1 GPs behind) idle=844/0/0 softirq=27747/27755 
> fqs=0 last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
> [ 9442.430224]3-...: (2 GPs behind) idle=8f8/0/0 softirq=32197/32198 
> fqs=0 last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
> [ 9442.441340]4-...: (7 GPs behind) idle=740/0/0 softirq=22351/22352 
> fqs=0 last_accelerate: ca88/de80, nonlazy_posted: 0, L.
> [ 9442.452456]5-...: (2 GPs behind) idle=9b0/0/0 softirq=21315/21319 
> fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> [ 9442.463572]6-...: (2 GPs behind) idle=794/0/0 softirq=19699/19707 
> fqs=0 last_accelerate: ba62/de88, nonlazy_posted: 0, L.
> [ 9442.474688]7-...: (2 GPs behind) idle=ac4/0/0 softirq=22547/22554 
> fqs=0 last_accelerate: b280/de88, nonlazy_posted: 0, L.
> [ 9442.485803]8-...: (9 GPs behind) idle=118/0/0 softirq=281/291 
> fqs=0 last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
> [ 9442.496571]9-...: (9 GPs behind) idle=8fc/0/0 softirq=284/292 
> fqs=0 last_accelerate: 6030/de88, nonlazy_posted: 0, L.
> [ 9442.507339]10-...: (14 GPs behind) idle=f78/0/0 softirq=254/254 
> fqs=0 last_accelerate: 5487/de88, nonlazy_posted: 0, L.
> [ 9442.518281]11-...: (9 GPs behind) idle=c9c/0/0 softirq=301/308 
> fqs=0 last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
> [ 9442.529136]12-...: (9 GPs behind) idle=4a4/0/0 softirq=735/737 
> fqs=0 last_accelerate: 6010/de99, nonlazy_posted: 0, L.
> [ 9442.539992]13-...: (9 GPs behind) idle=34c/0/0 softirq=1121/1131 
> fqs=0 last_accelerate: b280/de99, nonlazy_posted: 0, L.
> [ 9442.551020]14-...: (9 GPs behind) idle=2f4/0/0 softirq=707/713 
> fqs=0 last_accelerate: 6030/de99, nonlazy_posted: 0, L.
> [ 9442.561875]15-...: (2 GPs behind) idle=b30/0/0 softirq=821/976 
> fqs=0 last_accelerate: c208/de99, nonlazy_posted: 0, L.
> [ 9442.572730]17-...: (2 GPs behind) idle=5a8/0/0 softirq=1456/1565 
> fqs=0 last_accelerate: ca88/de99, nonlazy_posted: 0, L.
> [ 9442.583759]18-...: (2 GPs behind) idle=2e4/0/0 softirq=1923/1936 
> fqs=0 last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
> [ 9442.594787]19-...: (2 GPs behind) idle=138/0/0 softirq=1421/1432 
> fqs=0 last_accelerate: b280/dea7, nonlazy_posted: 0, L.
> [ 9442.605816]20-...: (50 GPs behind) idle=634/0/0 softirq=217/219 
> fqs=0 last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
> [ 9442.616758]21-...: (2 GPs behind) idle=eb8/0/0 softirq=1368/1369 
> fqs=0 last_accelerate: b599/deb2, nonlazy_posted: 0, L.
> [ 9442.627786]22-...: (1 GPs behind) idle=aa8/0/0 softirq=229/232 
> fqs=0 last_accelerate: c604/deb2, nonlazy_posted: 0, L.
> [ 9442.638641]23-...: (1 GPs behind) idle=488/0/0 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-27 Thread Jonathan Cameron
On Thu, 27 Jul 2017 14:49:03 +0100
Jonathan Cameron  wrote:

> On Thu, 27 Jul 2017 05:49:13 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:  
> > > On Wed, 26 Jul 2017 18:42:14 -0700
> > > "Paul E. McKenney"  wrote:
> > > 
> > > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:
> > > 
> > > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > > dump listing almost all of the cpus as having missed a grace period.  
> > > > > 
> > > > 
> > > > I have seen stranger things, but admittedly not often.
> > > 
> > > So the backtraces show the RCU gp thread in schedule_timeout.
> > > 
> > > Are you sure that its timeout has expired and it's not being scheduled,
> > > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > > scheduled but not correctly noting gps on other CPUs?
> > > 
> > > It's not in R state, so if it's not being scheduled at all, then it's
> > > because the timer has not fired:
> > 
> > Good point, Nick!
> > 
> > Jonathan, could you please reproduce collecting timer event tracing?  
> I'm a little new to tracing (only started playing with it last week)
> so fingers crossed I've set it up right.  No splats yet.  Was getting
> splats on reading out the trace when running with the RCU stall timer
> set to 4 so have increased that back to the default and am rerunning.
> 
> This may take a while.  Correct me if I've gotten this wrong to save time
> 
> echo "timer:*" > /sys/kernel/debug/tracing/set_event
> 
> when it dumps, just send you the relevant part of what is in
> /sys/kernel/debug/tracing/trace?

Interestingly the only thing that can make it trip for me with tracing on
is peeking in the tracing buffers.  Not sure this is a valid case or
not.

Anyhow all timer activity seems to stop around the area of interest.


[ 9442.413624] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 9442.419107]  1-...: (1 GPs behind) idle=844/0/0 softirq=27747/27755 fqs=0 
last_accelerate: dd6a/de80, nonlazy_posted: 0, L.
[ 9442.430224]  3-...: (2 GPs behind) idle=8f8/0/0 softirq=32197/32198 fqs=0 
last_accelerate: 29b1/de80, nonlazy_posted: 0, L.
[ 9442.441340]  4-...: (7 GPs behind) idle=740/0/0 softirq=22351/22352 fqs=0 
last_accelerate: ca88/de80, nonlazy_posted: 0, L.
[ 9442.452456]  5-...: (2 GPs behind) idle=9b0/0/0 softirq=21315/21319 fqs=0 
last_accelerate: b280/de88, nonlazy_posted: 0, L.
[ 9442.463572]  6-...: (2 GPs behind) idle=794/0/0 softirq=19699/19707 fqs=0 
last_accelerate: ba62/de88, nonlazy_posted: 0, L.
[ 9442.474688]  7-...: (2 GPs behind) idle=ac4/0/0 softirq=22547/22554 fqs=0 
last_accelerate: b280/de88, nonlazy_posted: 0, L.
[ 9442.485803]  8-...: (9 GPs behind) idle=118/0/0 softirq=281/291 fqs=0 
last_accelerate: c3fe/de88, nonlazy_posted: 0, L.
[ 9442.496571]  9-...: (9 GPs behind) idle=8fc/0/0 softirq=284/292 fqs=0 
last_accelerate: 6030/de88, nonlazy_posted: 0, L.
[ 9442.507339]  10-...: (14 GPs behind) idle=f78/0/0 softirq=254/254 fqs=0 
last_accelerate: 5487/de88, nonlazy_posted: 0, L.
[ 9442.518281]  11-...: (9 GPs behind) idle=c9c/0/0 softirq=301/308 fqs=0 
last_accelerate: 3d3e/de99, nonlazy_posted: 0, L.
[ 9442.529136]  12-...: (9 GPs behind) idle=4a4/0/0 softirq=735/737 fqs=0 
last_accelerate: 6010/de99, nonlazy_posted: 0, L.
[ 9442.539992]  13-...: (9 GPs behind) idle=34c/0/0 softirq=1121/1131 fqs=0 
last_accelerate: b280/de99, nonlazy_posted: 0, L.
[ 9442.551020]  14-...: (9 GPs behind) idle=2f4/0/0 softirq=707/713 fqs=0 
last_accelerate: 6030/de99, nonlazy_posted: 0, L.
[ 9442.561875]  15-...: (2 GPs behind) idle=b30/0/0 softirq=821/976 fqs=0 
last_accelerate: c208/de99, nonlazy_posted: 0, L.
[ 9442.572730]  17-...: (2 GPs behind) idle=5a8/0/0 softirq=1456/1565 fqs=0 
last_accelerate: ca88/de99, nonlazy_posted: 0, L.
[ 9442.583759]  18-...: (2 GPs behind) idle=2e4/0/0 softirq=1923/1936 fqs=0 
last_accelerate: ca88/dea7, nonlazy_posted: 0, L.
[ 9442.594787]  19-...: (2 GPs behind) idle=138/0/0 softirq=1421/1432 fqs=0 
last_accelerate: b280/dea7, nonlazy_posted: 0, L.
[ 9442.605816]  20-...: (50 GPs behind) idle=634/0/0 softirq=217/219 fqs=0 
last_accelerate: c96f/dea7, nonlazy_posted: 0, L.
[ 9442.616758]  21-...: (2 GPs behind) idle=eb8/0/0 softirq=1368/1369 fqs=0 
last_accelerate: b599/deb2, nonlazy_posted: 0, L.
[ 9442.627786]  22-...: (1 GPs behind) idle=aa8/0/0 softirq=229/232 fqs=0 
last_accelerate: c604/deb2, nonlazy_posted: 0, L.
[ 9442.638641]  23-...: (1 GPs behind) idle=488/0/0 softirq=247/248 fqs=0 
last_accelerate: c600/deb2, nonlazy_posted: 0, L.
[ 9442.649496]  24-...: (33 GPs behind) idle=f7c/0/0 softirq=319/319 fqs=0 
last_accelerate: 5290/deb2, nonlazy_posted: 0, L.
[ 9442.660437]  25-...: (33 GPs behind) idle=944/0/0 softirq=308/308 fqs=0 
last_accelerate: 52c0/deb2, nonlazy_posted: 0, L.
[ 9442.671379]  26-...: (9 GPs behind) 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-27 Thread Jonathan Cameron
On Thu, 27 Jul 2017 05:49:13 -0700
"Paul E. McKenney"  wrote:

> On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:
> > On Wed, 26 Jul 2017 18:42:14 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:  
> >   
> > > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > > dump listing almost all of the cpus as having missed a grace period.
> > > 
> > > I have seen stranger things, but admittedly not often.  
> > 
> > So the backtraces show the RCU gp thread in schedule_timeout.
> > 
> > Are you sure that its timeout has expired and it's not being scheduled,
> > or could it be a bad (large) timeout (looks unlikely) or that it's being
> > scheduled but not correctly noting gps on other CPUs?
> > 
> > It's not in R state, so if it's not being scheduled at all, then it's
> > because the timer has not fired:  
> 
> Good point, Nick!
> 
> Jonathan, could you please reproduce collecting timer event tracing?
I'm a little new to tracing (only started playing with it last week)
so fingers crossed I've set it up right.  No splats yet.  Was getting
splats on reading out the trace when running with the RCU stall timer
set to 4 so have increased that back to the default and am rerunning.

This may take a while.  Correct me if I've gotten this wrong to save time

echo "timer:*" > /sys/kernel/debug/tracing/set_event

when it dumps, just send you the relevant part of what is in
/sys/kernel/debug/tracing/trace?

Thanks,

Jonathan
> 
>   Thanx, Paul
> 
> > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 
> > f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > [ 1984.638153] rcu_preempt S0 9  2 0x
> > [ 1984.643626] Call trace:
> > [ 1984.646059] [] __switch_to+0x90/0xa8
> > [ 1984.651189] [] __schedule+0x19c/0x5d8
> > [ 1984.656400] [] schedule+0x38/0xa0
> > [ 1984.661266] [] schedule_timeout+0x124/0x218
> > [ 1984.667002] [] rcu_gp_kthread+0x4fc/0x748
> > [ 1984.672564] [] kthread+0xfc/0x128
> > [ 1984.677429] [] ret_from_fork+0x10/0x50
> >   
> 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-27 Thread Paul E. McKenney
On Thu, Jul 27, 2017 at 02:34:00PM +1000, Nicholas Piggin wrote:
> On Wed, 26 Jul 2017 18:42:14 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:
> 
> > > Indeed, that really wouldn't explain how we end up with a RCU stall
> > > dump listing almost all of the cpus as having missed a grace period.  
> > 
> > I have seen stranger things, but admittedly not often.
> 
> So the backtraces show the RCU gp thread in schedule_timeout.
> 
> Are you sure that its timeout has expired and it's not being scheduled,
> or could it be a bad (large) timeout (looks unlikely) or that it's being
> scheduled but not correctly noting gps on other CPUs?
> 
> It's not in R state, so if it's not being scheduled at all, then it's
> because the timer has not fired:

Good point, Nick!

Jonathan, could you please reproduce collecting timer event tracing?

Thanx, Paul

> [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 
> RCU_GP_WAIT_FQS(3) ->state=0x1
> [ 1984.638153] rcu_preempt S0 9  2 0x
> [ 1984.643626] Call trace:
> [ 1984.646059] [] __switch_to+0x90/0xa8
> [ 1984.651189] [] __schedule+0x19c/0x5d8
> [ 1984.656400] [] schedule+0x38/0xa0
> [ 1984.661266] [] schedule_timeout+0x124/0x218
> [ 1984.667002] [] rcu_gp_kthread+0x4fc/0x748
> [ 1984.672564] [] kthread+0xfc/0x128
> [ 1984.677429] [] ret_from_fork+0x10/0x50
> 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-27 Thread Jonathan Cameron
On Wed, 26 Jul 2017 18:13:12 +0100
Jonathan Cameron  wrote:

> On Wed, 26 Jul 2017 09:54:32 -0700
> David Miller  wrote:
> 
> > From: "Paul E. McKenney" 
> > Date: Wed, 26 Jul 2017 08:49:00 -0700
> >   
> > > On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:
> > >> Didn't leave it long enough. Still bad on 4.10-rc7 just took over
> > >> an hour to occur.
> > > 
> > > And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
> > > are just greatly reducing the probability of the problem rather than
> > > completely preventing it.
> > > 
> > > Still, hopefully useful information, thank you for the testing!
> 
> Not sure it actually gives us much information, but no issues yet
> with a simple program running on every cpu that wakes up every 3 seconds.
> 
> Will leave it running overnight and report back in the morning.
Perhaps unsurprisingly the above test didn't show any splats.

So it appears a userspace wakeup is enough to stop the issue happening
(or at least make it a lot less likely).

Jonathan
> 
> > 
> > I guess that invalidates my idea to test reverting recent changes to
> > the tick-sched.c code... :-/
> > 
> > In NO_HZ_IDLE mode, what is really supposed to happen on a completely
> > idle system?
> > 
> > All the cpus enter the idle loop, have no timers programmed, and they
> > all just go to sleep until an external event happens.
> > 
> > What ensures that grace periods get processed in this regime?  



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Nicholas Piggin
On Wed, 26 Jul 2017 18:42:14 -0700
"Paul E. McKenney"  wrote:

> On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:

> > Indeed, that really wouldn't explain how we end up with a RCU stall
> > dump listing almost all of the cpus as having missed a grace period.  
> 
> I have seen stranger things, but admittedly not often.

So the backtraces show the RCU gp thread in schedule_timeout.

Are you sure that its timeout has expired and it's not being scheduled,
or could it be a bad (large) timeout (looks unlikely) or that it's being
scheduled but not correctly noting gps on other CPUs?

It's not in R state, so if it's not being scheduled at all, then it's
because the timer has not fired:

[ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 
RCU_GP_WAIT_FQS(3) ->state=0x1
[ 1984.638153] rcu_preempt S0 9  2 0x
[ 1984.643626] Call trace:
[ 1984.646059] [] __switch_to+0x90/0xa8
[ 1984.651189] [] __schedule+0x19c/0x5d8
[ 1984.656400] [] schedule+0x38/0xa0
[ 1984.661266] [] schedule_timeout+0x124/0x218
[ 1984.667002] [] rcu_gp_kthread+0x4fc/0x748
[ 1984.672564] [] kthread+0xfc/0x128
[ 1984.677429] [] ret_from_fork+0x10/0x50
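
For reference, the schedule_timeout frame in that stack is reached from
the grace-period kthread's force-quiescent-state wait.  Schematically
(heavily condensed from rcu_gp_kthread() in kernel/rcu/tree.c of this
era; gp_complete is a stand-in for the real exit checks):

	for (;;) {
		rsp->gp_state = RCU_GP_WAIT_FQS;  /* the "(3)" in the stall message */
		ret = swait_event_interruptible_timeout(rsp->gp_wq,
				rcu_gp_fqs_check_wake(rsp, &gf), j);
		/* The wait above sleeps via schedule_timeout() -- the frame
		 * in the backtrace.  If the j-jiffy timer never fires and no
		 * swake_up() arrives, control never returns here, quiescent
		 * states are never forced, and the warning reports the
		 * kthread as starved. */
		if (gp_complete)                  /* schematic exit check */
			break;
		rcu_gp_fqs(rsp, first_time);      /* force quiescent states */
	}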


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Paul E. McKenney
On Wed, Jul 26, 2017 at 04:22:00PM -0700, David Miller wrote:
> From: "Paul E. McKenney" 
> Date: Wed, 26 Jul 2017 16:15:05 -0700
> 
> > On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:
> >> Just out of curiosity, what x86 idle method is your machine using?
> >> The mwait one or the one which simply uses 'halt'?  The mwait variant
> >> might mask this bug, and halt would be a lot closer to how sparc64 and
> >> Jonathan's system operates.
> > 
> > My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> > I am not using the mwait one.  Here is a grep for IDLE in my .config:
> > 
> > CONFIG_NO_HZ_IDLE=y
> > CONFIG_GENERIC_SMP_IDLE_THREAD=y
> > # CONFIG_IDLE_PAGE_TRACKING is not set
> > CONFIG_ACPI_PROCESSOR_IDLE=y
> > CONFIG_CPU_IDLE=y
> > # CONFIG_CPU_IDLE_GOV_LADDER is not set
> > CONFIG_CPU_IDLE_GOV_MENU=y
> > # CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
> > # CONFIG_INTEL_IDLE is not set
> 
> No, that doesn't influence it.  It is determined by cpu features at
> run time.
> 
> If you are using mwait, it'll say so in your kernel log like this:
> 
>   using mwait in idle threads

Thank you for the hint!

And vim says:

"E486: Pattern not found: using mwait in idle threads"

> >> On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> >> local TICK register keeps advancing, and the local timer therefore
> >> will still trigger.  Also, any externally generated interrupts
> >> (including cross calls) will wake up the cpu as well.
> >> 
> >> The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> >> case.  One of my running theories is that we miss scheduling a tick
> >> due to a race.  That would be consistent with the behavior we see
> >> in the RCU dumps, I think.
> > 
> > But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> > warning?  By default, your grace period needs to extend for more than
> > 21 seconds (more than one-third of a -minute-) to get one.  Or do
> > you mean that the ticks get shut off now and forever, as opposed to
> > just losing one of them?
> 
> Hmmm, good point.  And I was talking about simply missing one tick.
> 
> Indeed, that really wouldn't explain how we end up with a RCU stall
> dump listing almost all of the cpus as having missed a grace period.

I have seen stranger things, but admittedly not often.

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread David Miller
From: "Paul E. McKenney" 
Date: Wed, 26 Jul 2017 16:15:05 -0700

> On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:
>> Just out of curiosity, what x86 idle method is your machine using?
>> The mwait one or the one which simply uses 'halt'?  The mwait variant
>> might mask this bug, and halt would be a lot closer to how sparc64 and
>> Jonathan's system operates.
> 
> My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
> I am not using the mwait one.  Here is a grep for IDLE in my .config:
> 
>   CONFIG_NO_HZ_IDLE=y
>   CONFIG_GENERIC_SMP_IDLE_THREAD=y
>   # CONFIG_IDLE_PAGE_TRACKING is not set
>   CONFIG_ACPI_PROCESSOR_IDLE=y
>   CONFIG_CPU_IDLE=y
>   # CONFIG_CPU_IDLE_GOV_LADDER is not set
>   CONFIG_CPU_IDLE_GOV_MENU=y
>   # CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
>   # CONFIG_INTEL_IDLE is not set

No, that doesn't influence it.  It is determined by cpu features at
run time.

If you are using mwait, it'll say so in your kernel log like this:

using mwait in idle threads

>> On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
>> local TICK register keeps advancing, and the local timer therefore
>> will still trigger.  Also, any externally generated interrupts
>> (including cross calls) will wake up the cpu as well.
>> 
>> The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
>> case.  One of my running theories is that we miss scheduling a tick
>> due to a race.  That would be consistent with the behavior we see
>> in the RCU dumps, I think.
> 
> But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
> warning?  By default, your grace period needs to extend for more than
> 21 seconds (more than one-third of a -minute-) to get one.  Or do
> you mean that the ticks get shut off now and forever, as opposed to
> just losing one of them?

Hmmm, good point.  And I was talking about simply missing one tick.

Indeed, that really wouldn't explain how we end up with a RCU stall
dump listing almost all of the cpus as having missed a grace period.


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Paul E. McKenney
On Wed, Jul 26, 2017 at 03:45:40PM -0700, David Miller wrote:
> From: "Paul E. McKenney" 
> Date: Wed, 26 Jul 2017 15:36:58 -0700
> 
> > And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> > CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> > really is having an effect.
> 
> Thanks for all of the info Paul, I'll digest this and scan over the
> code myself.
> 
> Just out of curiosity, what x86 idle method is your machine using?
> The mwait one or the one which simply uses 'halt'?  The mwait variant
> might mask this bug, and halt would be a lot closer to how sparc64 and
> Jonathan's system operates.

My kernel builds with CONFIG_INTEL_IDLE=n, which I believe means that
I am not using the mwait one.  Here is a grep for IDLE in my .config:

CONFIG_NO_HZ_IDLE=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
# CONFIG_IDLE_PAGE_TRACKING is not set
CONFIG_ACPI_PROCESSOR_IDLE=y
CONFIG_CPU_IDLE=y
# CONFIG_CPU_IDLE_GOV_LADDER is not set
CONFIG_CPU_IDLE_GOV_MENU=y
# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
# CONFIG_INTEL_IDLE is not set

> On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
> local TICK register keeps advancing, and the local timer therefore
> will still trigger.  Also, any externally generated interrupts
> (including cross calls) will wake up the cpu as well.
> 
> The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
> case.  One of my running theories is that we miss scheduling a tick
> due to a race.  That would be consistent with the behavior we see
> in the RCU dumps, I think.

But wouldn't you have to miss a -lot- of ticks to get an RCU CPU stall
warning?  By default, your grace period needs to extend for more than
21 seconds (more than one-third of a -minute-) to get one.  Or do
you mean that the ticks get shut off now and forever, as opposed to
just losing one of them?

> Anyways, just a theory, and that's why I keep mentioning that commit
> about the revert of the revert (specifically
> 411fe24e6b7c283c3a1911450cdba6dd3aaea56e).
> 
> :-)

I am running an overnight test in preparation for attempting to push
some fixes for regressions into 4.12, but will try reverting this
and enabling CONFIG_HZ_PERIODIC tomorrow.

Jonathan, might the commit that Dave points out above be what reduces
the probability of occurrence as you test older releases?

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread David Miller
From: "Paul E. McKenney" 
Date: Wed, 26 Jul 2017 15:36:58 -0700

> And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
> CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
> really is having an effect.

Thanks for all of the info Paul, I'll digest this and scan over the
code myself.

Just out of curiosity, what x86 idle method is your machine using?
The mwait one or the one which simply uses 'halt'?  The mwait variant
might mask this bug, and halt would be a lot closer to how sparc64 and
Jonathan's system operates.

On sparc64 the cpu yield we do in the idle loop sleeps the cpu.  Its
local TICK register keeps advancing, and the local timer therefore
will still trigger.  Also, any externally generated interrupts
(including cross calls) will wake up the cpu as well.

The tick-sched code is really tricky wrt. NO_HZ even in the NO_HZ_IDLE
case.  One of my running theories is that we miss scheduling a tick
due to a race.  That would be consistent with the behavior we see
in the RCU dumps, I think.

Anyways, just a theory, and that's why I keep mentioning that commit
about the revert of the revert (specifically
411fe24e6b7c283c3a1911450cdba6dd3aaea56e).

:-)


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Paul E. McKenney
On Wed, Jul 26, 2017 at 10:50:13AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 26, 2017 at 09:54:32AM -0700, David Miller wrote:
> > From: "Paul E. McKenney" 
> > Date: Wed, 26 Jul 2017 08:49:00 -0700
> > 
> > > On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:
> > >> Didn't leave it long enough. Still bad on 4.10-rc7 just took over
> > >> an hour to occur.
> > > 
> > > And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
> > > are just greatly reducing the probability of the problem rather than
> > > completely preventing it.
> > > 
> > > Still, hopefully useful information, thank you for the testing!
> > 
> > I guess that invalidates my idea to test reverting recent changes to
> > the tick-sched.c code... :-/
> > 
> > In NO_HZ_IDLE mode, what is really supposed to happen on a completely
> > idle system?
> > 
> > All the cpus enter the idle loop, have no timers programmed, and they
> > all just go to sleep until an external event happens.
> > 
> > What ensures that grace periods get processed in this regime?
> 
> There are several different situations with different mechanisms:
> 
> 1.  No grace period is in progress and no RCU callbacks are pending
>   anywhere in the system.  In this case, some other event would
>   need to start a grace period, so RCU just stays idle until that
>   happens, possibly indefinitely.  According to the battery-powered
>   embedded guys, this is a feature, not a bug.  ;-)
> 
> 2.  No grace period is in progress, but there is at least one RCU
>   callback somewhere in the system.  In this case, the mechanism
>   depends on CONFIG_RCU_FAST_NO_HZ:
> 
>   CONFIG_RCU_FAST_NO_HZ=n:  The CPU on which the callback is
>   queued will return "true" in response to the call to
>   rcu_needs_cpu() that is made shortly before that CPU
>   enters idle.  This will cause the scheduling-clock
>   interrupt to remain on, despite the CPU being idle,
>   which will in turn allow RCU's state machine to continue
>   running out of softirq, triggered by the scheduling-clock
>   interrupts.
> 
>   CONFIG_RCU_FAST_NO_HZ=y:  The CPU on which the callback is queued
>   will return "false" in response to the call to
>   rcu_needs_cpu() that is made shortly before that CPU
>   enters idle.  However, it will also request a next event
>   about six seconds in the future if all callbacks do
>   nothing but free memory (kfree_rcu()), or about four
>   jiffies in the future if at least one callback does
>   something more than just free memory.
> 
>   There is also a rcu_prepare_for_idle() function that
>   is invoked later in the idle-entry process in this case
>   which will wake up the grace-period kthread if need be.
> 
> 3.  A grace period is in progress.  In this case the grace-period
>   kthread is either currently running (in which case there will be
>   at least one non-idle CPU) or is in a timed wait for its next
>   scan for idle/offline CPUs (such CPUs need the grace-period
>   kthread to report quiescent states on their behalf).  In this
>   latter case, the timer subsystem will post a next event that
>   will be the wakeup time for the grace-period kthread, or some
>   earlier event.
> 
>   This is where we have been seeing trouble, if for no other
>   reason because RCU CPU stall warnings only happen when there
>   is a grace period in progress.
> 
> That is the theory, anyway...
> 
> And when I enabled CONFIG_SOFTLOCKUP_DETECTOR, I still see failures.
> I did 24 half-hour rcutorture runs on the TREE01 scenario, and two of them
> saw RCU CPU stall warnings with starvation of the grace-period kthread.
> I just now started another test but without CONFIG_SOFTLOCKUP_DETECTOR
> to see if it makes a significant difference for my testing.  I do have
> CONFIG_RCU_FAST_NO_HZ=y in my runs.

And without CONFIG_SOFTLOCKUP_DETECTOR, I see five runs of 24 with RCU
CPU stall warnings.  So it seems likely that CONFIG_SOFTLOCKUP_DETECTOR
really is having an effect.

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Paul E. McKenney
On Wed, Jul 26, 2017 at 09:54:32AM -0700, David Miller wrote:
> From: "Paul E. McKenney" 
> Date: Wed, 26 Jul 2017 08:49:00 -0700
> 
> > On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:
> >> Didn't leave it long enough. Still bad on 4.10-rc7 just took over
> >> an hour to occur.
> > 
> > And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
> > are just greatly reducing the probability of the problem rather than
> > completely preventing it.
> > 
> > Still, hopefully useful information, thank you for the testing!
> 
> I guess that invalidates my idea to test reverting recent changes to
> the tick-sched.c code... :-/
> 
> In NO_HZ_IDLE mode, what is really supposed to happen on a completely
> idle system?
> 
> All the cpus enter the idle loop, have no timers programmed, and they
> all just go to sleep until an external event happens.
> 
> What ensures that grace periods get processed in this regime?

There are several different situations with different mechanisms:

1.  No grace period is in progress and no RCU callbacks are pending
anywhere in the system.  In this case, some other event would
need to start a grace period, so RCU just stays idle until that
happens, possibly indefinitely.  According to the battery-powered
embedded guys, this is a feature, not a bug.  ;-)

2.  No grace period is in progress, but there is at least one RCU
callback somewhere in the system.  In this case, the mechanism
depends on CONFIG_RCU_FAST_NO_HZ:

CONFIG_RCU_FAST_NO_HZ=n:  The CPU on which the callback is
queued will return "true" in response to the call to
rcu_needs_cpu() that is made shortly before that CPU
enters idle.  This will cause the scheduling-clock
interrupt to remain on, despite the CPU being idle,
which will in turn allow RCU's state machine to continue
running out of softirq, triggered by the scheduling-clock
interrupts.

CONFIG_RCU_FAST_NO_HZ=y:  The CPU on which the callback is queued
will return "false" in response to the call to
rcu_needs_cpu() that is made shortly before that CPU
enters idle.  However, it will also request a next event
about six seconds in the future if all callbacks do
nothing but free memory (kfree_rcu()), or about four
jiffies in the future if at least one callback does
something more than just free memory.

There is also a rcu_prepare_for_idle() function that
is invoked later in the idle-entry process in this case
which will wake up the grace-period kthread if need be.

3.  A grace period is in progress.  In this case the grace-period
kthread is either currently running (in which case there will be
at least one non-idle CPU) or is in a timed wait for its next
scan for idle/offline CPUs (such CPUs need the grace-period
kthread to report quiescent states on their behalf).  In this
latter case, the timer subsystem will post a next event that
will be the wakeup time for the grace-period kthread, or some
earlier event.

This is where we have been seeing trouble, if for no other
reason because RCU CPU stall warnings only happen when there
is a grace period in progress.

That is the theory, anyway...
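
To make cases 1 and 2 concrete, here is a minimal user-space C sketch
of the idle-entry decision described above.  It is purely illustrative:
the names, the HZ value, and the structure are stand-ins rather than
the kernel's actual rcu_needs_cpu() implementation.

#include <stdbool.h>
#include <stdio.h>

#define HZ 250			/* illustrative tick frequency */

struct cpu_rcu_state {
	bool cbs_queued;	/* RCU callbacks queued on this CPU? */
	bool cbs_all_lazy;	/* do all of them merely free memory? */
	bool fast_no_hz;	/* CONFIG_RCU_FAST_NO_HZ=y? */
};

/*
 * May the scheduling-clock tick be turned off at idle entry, and if
 * so, how far out should the next timer event be requested?
 */
static bool tick_may_stop(const struct cpu_rcu_state *cs,
			  unsigned long *next_event_jiffies)
{
	if (!cs->cbs_queued) {
		*next_event_jiffies = ~0UL;	/* case 1: sleep indefinitely */
		return true;
	}
	if (!cs->fast_no_hz)
		return false;	/* case 2, =n: keep the tick running */
	/* case 2, =y: tick stops, but a wakeup is requested anyway */
	*next_event_jiffies = cs->cbs_all_lazy ? 6 * HZ : 4;
	return true;
}

int main(void)
{
	struct cpu_rcu_state cs = { true, false, true };
	unsigned long next;

	if (tick_may_stop(&cs, &next))
		printf("tick stops; next event in %lu jiffies\n", next);
	else
		printf("tick stays on for RCU\n");
	return 0;
}

Case 3 depends instead on the timer event posted for the grace-period
kthread's own wakeup, as described above.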

And even with CONFIG_SOFTLOCKUP_DETECTOR enabled, I still see failures.
I did 24 half-hour rcutorture runs on the TREE01 scenario, and two of them
saw RCU CPU stall warnings with starvation of the grace-period kthread.
I just now started another test but without CONFIG_SOFTLOCKUP_DETECTOR
to see if it makes a significant difference for my testing.  I do have
CONFIG_RCU_FAST_NO_HZ=y in my runs.

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Jonathan Cameron
On Wed, 26 Jul 2017 09:54:32 -0700
David Miller  wrote:

> From: "Paul E. McKenney" 
> Date: Wed, 26 Jul 2017 08:49:00 -0700
> 
> > On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:  
> >> Didn't leave it long enough. Still bad on 4.10-rc7; it just took over
> >> an hour to occur.
> > 
> > And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
> > are just greatly reducing the probability of the problem rather than
> > completely preventing it.
> > 
> > Still, hopefully useful information, thank you for the testing!  

Not sure it actually gives us much information, but no issues yet
with a simple program running on every cpu that wakes up every 3 seconds.

Will leave it running overnight and report back in the morning.
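
For reference, such a waker can be as small as the sketch below -- one
thread pinned per CPU, sleeping three seconds at a time.  This is an
illustration of the kind of test program described (the actual program
was not posted); compile with gcc -pthread.

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* One thread pinned to each CPU, waking every 3 seconds. */
static void *waker(void *arg)
{
	long cpu = (long)arg;
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
	for (;;)
		sleep(3);	/* periodically disturb this CPU's idle state */
	return NULL;
}

int main(void)
{
	long cpu, ncpus = sysconf(_SC_NPROCESSORS_ONLN);
	pthread_t tid;

	for (cpu = 0; cpu < ncpus; cpu++)
		if (pthread_create(&tid, NULL, waker, (void *)cpu)) {
			perror("pthread_create");
			exit(EXIT_FAILURE);
		}
	pause();		/* run until killed */
	return 0;
}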

> 
> I guess that invalidates my idea to test reverting recent changes to
> the tick-sched.c code... :-/
> 
> In NO_HZ_IDLE mode, what is really supposed to happen on a completely
> idle system?
> 
> All the cpus enter the idle loop, have no timers programmed, and they
> all just go to sleep until an external event happens.
> 
> What ensures that grace periods get processed in this regime?


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread David Miller
From: "Paul E. McKenney" 
Date: Wed, 26 Jul 2017 08:49:00 -0700

> On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:
>> Didn't leave it long enough. Still bad on 4.10-rc7; it just took over
>> an hour to occur.
> 
> And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
> are just greatly reducing the probability of the problem rather than
> completely preventing it.
> 
> Still, hopefully useful information, thank you for the testing!

I guess that invalidates my idea to test reverting recent changes to
the tick-sched.c code... :-/

In NO_HZ_IDLE mode, what is really supposed to happen on a completely
idle system?

All the cpus enter the idle loop, have no timers programmed, and they
all just go to sleep until an external event happens.

What ensures that grace periods get processed in this regime?


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread David Miller
From: "Paul E. McKenney" 
Date: Wed, 26 Jul 2017 07:14:17 -0700

> Dave, any other ideas on what might be causing this or what might be
> tested?

I was going to go through the changes that happened between v4.12
and now to kernel/time/tick-sched.c and see if reverting any of
those help.

But I don't know when I would get to that.

Commit 411fe24e6b7c283c3a1911450cdba6dd3aaea56e looks particularly juicy.
:-)


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Paul E. McKenney
On Wed, Jul 26, 2017 at 04:33:40PM +0100, Jonathan Cameron wrote:
> On Wed, 26 Jul 2017 15:23:15 +0100
> Jonathan Cameron  wrote:
> 
> > On Wed, 26 Jul 2017 07:14:17 -0700
> > "Paul E. McKenney"  wrote:
> > 
> > > On Wed, Jul 26, 2017 at 01:28:01PM +0100, Jonathan Cameron wrote:  
> > > > On Wed, 26 Jul 2017 10:32:32 +0100
> > > > Jonathan Cameron  wrote:
> > > > 
> > > > > On Wed, 26 Jul 2017 09:16:23 +0100
> > > > > Jonathan Cameron  wrote:
> > > > > 
> > > > > > On Tue, 25 Jul 2017 21:12:17 -0700
> > > > > > "Paul E. McKenney"  wrote:
> > > > > >   
> > > > > > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote: 
> > > > > > >
> > > > > > > > From: "Paul E. McKenney" 
> > > > > > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > > > > > >   
> > > > > > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote: 
> > > > > > > > >  
> > > > > > > > >> Just to report, turning softlockup back on fixes things for 
> > > > > > > > >> me on
> > > > > > > > >> sparc64 too.  
> > > > > > > > > 
> > > > > > > > > Very good!
> > > > > > > > >   
> > > > > > > > >> The thing about softlockup is it runs an hrtimer, which 
> > > > > > > > >> seems to run
> > > > > > > > >> about every 4 seconds.  
> > > > > > > > > 
> > > > > > > > > I could see where that could shake things loose, but I am 
> > > > > > > > > surprised that
> > > > > > > > > it would be needed.  I ran a short run with 
> > > > > > > > > CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > > > > with no trouble, but I will be running a longer test later on.
> > > > > > > > >   
> > > > > > > > >> So I wonder if this is a NO_HZ problem.  
> > > > > > > > > 
> > > > > > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  
> > > > > > > > > What are
> > > > > > > > > you running?  (Again, my symptoms are slightly different, so 
> > > > > > > > > I might
> > > > > > > > > be seeing a different bug.)  
> > > > > > > > 
> > > > > > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > > > > > 
> > > > > > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR 
> > > > > > > > disabled.  
> > > > > > > 
> > > > > > > Same here -- but my failure case happens fairly rarely, so it 
> > > > > > > will take
> > > > > > > some time to gain reasonable confidence that enabling 
> > > > > > > SOFTLOCKUP_DETECTOR
> > > > > > > had effect.
> > > > > > > 
> > > > > > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > > > > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > > > > > 
> > > > > > >   Thanx, Paul
> > > > > > > 
> > > > > > I'll be the headless chicken running around and trying as many tests
> > > > > > as I can fit in.  Typical time to see the failure for us is sub 10
> > > > > > minutes so we'll see how far we get.
> > > > > > 
> > > > > > Make me a list to run if you like ;)
> > > > > > 
> > > > > > NO_HZ_PERIODIC=y running now.  
> > > > > By which I mean CONFIG_HZ_PERIODIC=y
> > > 
> > > I did get that messed up, didn't I?  Sorry for my confusion!
> > >   
> > > > > Anyhow, run for 40 minutes without seeing a splat, but my sanity check
> > > > > on the NO_HZ_FULL=n and NO_HZ_IDLE=y config this morning took 20 minutes so
> > > > > I won't have much confidence until we are a few hours in on this.
> > > > > 
> > > > > Anyhow, certainly looking like a promising direction for 
> > > > > investigation!
> > > > > 
> > > > Well it's done over 3 hours without a splat so I think it is fine with
> > > > CONFIG_HZ_PERIODIC=y
> > > 
> > > Thank you!
> > > 
> > > If you run with SOFTLOCKUP_DETECTOR=n and NO_HZ_IDLE=y, but have a normal
> > > user task waking up every few seconds on each CPU, does the problem occur?
> > > (The question is whether any disturbance gets things going, or whether 
> > > there
> > > is something special about SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y.)
> > > 
> > > Dave, any other ideas on what might be causing this or what might be
> > > tested?
> > > 
> > >   Thanx, Paul
> > >   
> > 
> > Although it's still early days (40 mins in) it looks like the issue first
> > occurred between 4.10-rc7 and 4.11-rc1 (don't ask why those particular RCs)
> > 
> > Bad, as with the current kernel, on 4.11-rc1 and good on 4.10-rc7.
> 
> Didn't leave it long enough. Still bad on 4.10-rc7; it just took over
> an hour to occur.

And it is quite possible that SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y
are just greatly reducing the probability of the problem rather than
completely preventing it.

Still, hopefully useful information, thank you for the testing!


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Jonathan Cameron
On Wed, 26 Jul 2017 15:23:15 +0100
Jonathan Cameron  wrote:

> On Wed, 26 Jul 2017 07:14:17 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Wed, Jul 26, 2017 at 01:28:01PM +0100, Jonathan Cameron wrote:  
> > > On Wed, 26 Jul 2017 10:32:32 +0100
> > > Jonathan Cameron  wrote:
> > > 
> > > > On Wed, 26 Jul 2017 09:16:23 +0100
> > > > Jonathan Cameron  wrote:
> > > > 
> > > > > On Tue, 25 Jul 2017 21:12:17 -0700
> > > > > "Paul E. McKenney"  wrote:
> > > > >   
> > > > > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:   
> > > > > >  
> > > > > > > From: "Paul E. McKenney" 
> > > > > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > > > > >   
> > > > > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:   
> > > > > > > >
> > > > > > > >> Just to report, turning softlockup back on fixes things for me 
> > > > > > > >> on
> > > > > > > >> sparc64 too.  
> > > > > > > > 
> > > > > > > > Very good!
> > > > > > > >   
> > > > > > > >> The thing about softlockup is it runs an hrtimer, which seems 
> > > > > > > >> to run
> > > > > > > >> about every 4 seconds.  
> > > > > > > > 
> > > > > > > > I could see where that could shake things loose, but I am 
> > > > > > > > surprised that
> > > > > > > > it would be needed.  I ran a short run with 
> > > > > > > > CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > > > with no trouble, but I will be running a longer test later on.
> > > > > > > >   
> > > > > > > >> So I wonder if this is a NO_HZ problem.  
> > > > > > > > 
> > > > > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  
> > > > > > > > What are
> > > > > > > > you running?  (Again, my symptoms are slightly different, so I 
> > > > > > > > might
> > > > > > > > be seeing a different bug.)  
> > > > > > > 
> > > > > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > > > > 
> > > > > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR 
> > > > > > > disabled.  
> > > > > > 
> > > > > > Same here -- but my failure case happens fairly rarely, so it will 
> > > > > > take
> > > > > > some time to gain reasonable confidence that enabling 
> > > > > > SOFTLOCKUP_DETECTOR
> > > > > > had effect.
> > > > > > 
> > > > > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > > > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > > > > 
> > > > > > Thanx, Paul
> > > > > > 
> > > > > I'll be the headless chicken running around and trying as many tests
> > > > > as I can fit in.  Typical time to see the failure for us is sub 10
> > > > > minutes so we'll see how far we get.
> > > > > 
> > > > > Make me a list to run if you like ;)
> > > > > 
> > > > > NO_HZ_PERIODIC=y running now.  
> > > > By which I mean CONFIG_HZ_PERIODIC=y
> > 
> > I did get that messed up, didn't I?  Sorry for my confusion!
> >   
> > > > Anyhow, run for 40 minutes without seeing a splat, but my sanity check
> > > > on the NO_HZ_FULL=n and NO_HZ_IDLE=y config this morning took 20 minutes so
> > > > I won't have much confidence until we are a few hours in on this.
> > > > 
> > > > Anyhow, certainly looking like a promising direction for investigation!
> > > > 
> > > Well it's done over 3 hours without a splat so I think it is fine with
> > > CONFIG_HZ_PERIODIC=y
> > 
> > Thank you!
> > 
> > If you run with SOFTLOCKUP_DETECTOR=n and NO_HZ_IDLE=y, but have a normal
> > user task waking up every few seconds on each CPU, does the problem occur?
> > (The question is whether any disturbance gets things going, or whether there
> > is something special about SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y.)
> > 
> > Dave, any other ideas on what might be causing this or what might be
> > tested?
> > 
> > Thanx, Paul
> >   
> 
> Although it's still early days (40 mins in) it looks like the issue first
> occurred between 4.10-rc7 and 4.11-rc1 (don't ask why those particular RCs)
> 
> Bad, as with the current kernel, on 4.11-rc1 and good on 4.10-rc7.
Didn't leave it long enough. Still bad on 4.10-rc7; it just took over
an hour to occur.
> 
> Could be that something different was hiding it in 4.10, though.  We have a fair
> delta from mainline back then unfortunately so bisecting will be
> 'interesting'.
> 
> I'll see if I can get the test you suggest running.
> 
> Jonathan



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Jonathan Cameron
On Wed, 26 Jul 2017 07:14:17 -0700
"Paul E. McKenney"  wrote:

> On Wed, Jul 26, 2017 at 01:28:01PM +0100, Jonathan Cameron wrote:
> > On Wed, 26 Jul 2017 10:32:32 +0100
> > Jonathan Cameron  wrote:
> >   
> > > On Wed, 26 Jul 2017 09:16:23 +0100
> > > Jonathan Cameron  wrote:
> > >   
> > > > On Tue, 25 Jul 2017 21:12:17 -0700
> > > > "Paul E. McKenney"  wrote:
> > > > 
> > > > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:  
> > > > > > From: "Paul E. McKenney" 
> > > > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > > > > 
> > > > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote: 
> > > > > > >
> > > > > > >> Just to report, turning softlockup back on fixes things for me on
> > > > > > >> sparc64 too.
> > > > > > > 
> > > > > > > Very good!
> > > > > > > 
> > > > > > >> The thing about softlockup is it runs an hrtimer, which seems to 
> > > > > > >> run
> > > > > > >> about every 4 seconds.
> > > > > > > 
> > > > > > > I could see where that could shake things loose, but I am 
> > > > > > > surprised that
> > > > > > > it would be needed.  I ran a short run with 
> > > > > > > CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > > with no trouble, but I will be running a longer test later on.
> > > > > > > 
> > > > > > >> So I wonder if this is a NO_HZ problem.
> > > > > > > 
> > > > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What 
> > > > > > > are
> > > > > > > you running?  (Again, my symptoms are slightly different, so I 
> > > > > > > might
> > > > > > > be seeing a different bug.)
> > > > > > 
> > > > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > > > 
> > > > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled. 
> > > > > >
> > > > > 
> > > > > Same here -- but my failure case happens fairly rarely, so it will 
> > > > > take
> > > > > some time to gain reasonable confidence that enabling 
> > > > > SOFTLOCKUP_DETECTOR
> > > > > had effect.
> > > > > 
> > > > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > > > 
> > > > >   Thanx, Paul
> > > > >   
> > > > I'll be the headless chicken running around and trying as many tests
> > > > as I can fit in.  Typical time to see the failure for us is sub 10
> > > > minutes so we'll see how far we get.
> > > > 
> > > > Make me a list to run if you like ;)
> > > > 
> > > > NO_HZ_PERIODIC=y running now.
> > > By which I mean CONFIG_HZ_PERIODIC=y  
> 
> I did get that messed up, didn't I?  Sorry for my confusion!
> 
> > > Anyhow, run for 40 minutes without seeing a splat, but my sanity check
> > > on the NO_HZ_FULL=n and NO_HZ_IDLE=y config this morning took 20 minutes so
> > > I won't have much confidence until we are a few hours in on this.
> > > 
> > > Anyhow, certainly looking like a promising direction for investigation!
> > >   
> > Well it's done over 3 hours without a splat so I think it is fine with
> > CONFIG_HZ_PERIODIC=y  
> 
> Thank you!
> 
> If you run with SOFTLOCKUP_DETECTOR=n and NO_HZ_IDLE=y, but have a normal
> user task waking up every few seconds on each CPU, does the problem occur?
> (The question is whether any disturbance gets things going, or whether there
> is something special about SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y.)
> 
> Dave, any other ideas on what might be causing this or what might be
> tested?
> 
>   Thanx, Paul
> 

Although it's still early days (40 mins in) it looks like the issue first
occurred between 4.10-rc7 and 4.11-rc1 (don't ask why those particular RCs)

Bad, as with the current kernel, on 4.11-rc1 and good on 4.10-rc7.

Could be that something different was hiding it in 4.10, though.  We have a fair
delta from mainline back then unfortunately so bisecting will be
'interesting'.

I'll see if I can get the test you suggest running.

Jonathan


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Paul E. McKenney
On Wed, Jul 26, 2017 at 01:28:01PM +0100, Jonathan Cameron wrote:
> On Wed, 26 Jul 2017 10:32:32 +0100
> Jonathan Cameron  wrote:
> 
> > On Wed, 26 Jul 2017 09:16:23 +0100
> > Jonathan Cameron  wrote:
> > 
> > > On Tue, 25 Jul 2017 21:12:17 -0700
> > > "Paul E. McKenney"  wrote:
> > >   
> > > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:
> > > > > From: "Paul E. McKenney" 
> > > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > > >   
> > > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:  
> > > > > >> Just to report, turning softlockup back on fixes things for me on
> > > > > >> sparc64 too.  
> > > > > > 
> > > > > > Very good!
> > > > > >   
> > > > > >> The thing about softlockup is it runs an hrtimer, which seems to 
> > > > > >> run
> > > > > >> about every 4 seconds.  
> > > > > > 
> > > > > > I could see where that could shake things loose, but I am surprised 
> > > > > > that
> > > > > > it would be needed.  I ran a short run with 
> > > > > > CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > with no trouble, but I will be running a longer test later on.
> > > > > >   
> > > > > >> So I wonder if this is a NO_HZ problem.  
> > > > > > 
> > > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What 
> > > > > > are
> > > > > > you running?  (Again, my symptoms are slightly different, so I might
> > > > > > be seeing a different bug.)  
> > > > > 
> > > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > > 
> > > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.   
> > > > >
> > > > 
> > > > Same here -- but my failure case happens fairly rarely, so it will take
> > > > some time to gain reasonable confidence that enabling 
> > > > SOFTLOCKUP_DETECTOR
> > > > had effect.
> > > > 
> > > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > > 
> > > > Thanx, Paul
> > > > 
> > > I'll be the headless chicken running around and trying as many tests
> > > as I can fit in.  Typical time to see the failure for us is sub 10
> > > minutes so we'll see how far we get.
> > > 
> > > Make me a list to run if you like ;)
> > > 
> > > NO_HZ_PERIODIC=y running now.  
> > By which I mean CONFIG_HZ_PERIODIC=y

I did get that messed up, didn't I?  Sorry for my confusion!

> > Anyhow, run for 40 minutes without seeing a splat, but my sanity check
> > on the NO_HZ_FULL=n and NO_HZ_IDLE=y config this morning took 20 minutes so
> > I won't have much confidence until we are a few hours in on this.
> > 
> > Anyhow, certainly looking like a promising direction for investigation!
> > 
> Well it's done over 3 hours without a splat so I think it is fine with
> CONFIG_HZ_PERIODIC=y

Thank you!

If you run with SOFTLOCKUP_DETECTOR=n and NO_HZ_IDLE=y, but have a normal
user task waking up every few seconds on each CPU, does the problem occur?
(The question is whether any disturbance gets things going, or whether there
is something special about SOFTLOCKUP_DETECTOR=y and HZ_PERIODIC=y.)

Dave, any other ideas on what might be causing this or what might be
tested?

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Jonathan Cameron
On Wed, 26 Jul 2017 13:28:01 +0100
Jonathan Cameron  wrote:

> On Wed, 26 Jul 2017 10:32:32 +0100
> Jonathan Cameron  wrote:
> 
> > On Wed, 26 Jul 2017 09:16:23 +0100
> > Jonathan Cameron  wrote:
> >   
> > > On Tue, 25 Jul 2017 21:12:17 -0700
> > > "Paul E. McKenney"  wrote:
> > > 
> > > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:  
> > > > > From: "Paul E. McKenney" 
> > > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > > > 
> > > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:   
> > > > > >  
> > > > > >> Just to report, turning softlockup back on fixes things for me on
> > > > > >> sparc64 too.
> > > > > > 
> > > > > > Very good!
> > > > > > 
> > > > > >> The thing about softlockup is it runs an hrtimer, which seems to 
> > > > > >> run
> > > > > >> about every 4 seconds.
> > > > > > 
> > > > > > I could see where that could shake things loose, but I am surprised 
> > > > > > that
> > > > > > it would be needed.  I ran a short run with 
> > > > > > CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > with no trouble, but I will be running a longer test later on.
> > > > > > 
> > > > > >> So I wonder if this is a NO_HZ problem.
> > > > > > 
> > > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What 
> > > > > > are
> > > > > > you running?  (Again, my symptoms are slightly different, so I might
> > > > > > be seeing a different bug.)
> > > > > 
> > > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > > 
> > > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.   
> > > > >  
> > > > 
> > > > Same here -- but my failure case happens fairly rarely, so it will take
> > > > some time to gain reasonable confidence that enabling 
> > > > SOFTLOCKUP_DETECTOR
> > > > had effect.
> > > > 
> > > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > > 
> > > > Thanx, Paul
> > > >   
> > > I'll be the headless chicken running around and trying as many tests
> > > as I can fit in.  Typical time to see the failure for us is sub 10
> > > minutes so we'll see how far we get.
> > > 
> > > Make me a list to run if you like ;)
> > > 
> > > NO_HZ_PERIODIC=y running now.
> > By which I mean CONFIG_HZ_PERIODIC=y
> > 
> > Anyhow, run for 40 minutes without seeing a splat, but my sanity check
> > on the NO_HZ_FULL=n and NO_HZ_IDLE=y config this morning took 20 minutes so
> > I won't have much confidence until we are a few hours in on this.
> > 
> > Anyhow, certainly looking like a promising direction for investigation!
> >   
> Well it's done over 3 hours without a splat so I think it is fine with
> CONFIG_HZ_PERIODIC=y
> 
As I think we expected, the problem occurs with NO_HZ_FULL.
It happened pretty quickly, but given the somewhat random nature
it might just be coincidence.

Jonathan
> 
> > Jonathan
> >   
> > > 
> > > Jonathan
> > > 




Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Jonathan Cameron
On Wed, 26 Jul 2017 10:32:32 +0100
Jonathan Cameron  wrote:

> On Wed, 26 Jul 2017 09:16:23 +0100
> Jonathan Cameron  wrote:
> 
> > On Tue, 25 Jul 2017 21:12:17 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:
> > > > From: "Paul E. McKenney" 
> > > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > >   
> > > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:  
> > > > >> Just to report, turning softlockup back on fixes things for me on
> > > > >> sparc64 too.  
> > > > > 
> > > > > Very good!
> > > > >   
> > > > >> The thing about softlockup is it runs an hrtimer, which seems to run
> > > > >> about every 4 seconds.  
> > > > > 
> > > > > I could see where that could shake things loose, but I am surprised 
> > > > > that
> > > > > it would be needed.  I ran a short run with 
> > > > > CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > with no trouble, but I will be running a longer test later on.
> > > > >   
> > > > >> So I wonder if this is a NO_HZ problem.  
> > > > > 
> > > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > > > > you running?  (Again, my symptoms are slightly different, so I might
> > > > > be seeing a different bug.)  
> > > > 
> > > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > > 
> > > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled. 
> > > >  
> > > 
> > > Same here -- but my failure case happens fairly rarely, so it will take
> > > some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
> > > had effect.
> > > 
> > > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > > 
> > >   Thanx, Paul
> > > 
> > I'll be the headless chicken running around and trying as many tests
> > as I can fit in.  Typical time to see the failure for us is sub 10
> > minutes so we'll see how far we get.
> > 
> > Make me a list to run if you like ;)
> > 
> > NO_HZ_PERIODIC=y running now.  
> By which I mean CONFIG_HZ_PERIODIC=y
> 
> Anyhow, run for 40 minutes without seeing a splat, but my sanity check
> on the NO_HZ_FULL=n and NO_HZ_IDLE=y config this morning took 20 minutes so
> I won't have much confidence until we are a few hours in on this.
> 
> Anyhow, certainly looking like a promising direction for investigation!
> 
Well it's done over 3 hours without a splat so I think it is fine with
CONFIG_HZ_PERIODIC=y


> Jonathan
> 
> > 
> > Jonathan
> > 




Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Jonathan Cameron
On Wed, 26 Jul 2017 09:16:23 +0100
Jonathan Cameron  wrote:

> On Tue, 25 Jul 2017 21:12:17 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:  
> > > From: "Paul E. McKenney" 
> > > Date: Tue, 25 Jul 2017 20:55:45 -0700
> > > 
> > > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:
> > > >> Just to report, turning softlockup back on fixes things for me on
> > > >> sparc64 too.
> > > > 
> > > > Very good!
> > > > 
> > > >> The thing about softlockup is it runs an hrtimer, which seems to run
> > > >> about every 4 seconds.
> > > > 
> > > > I could see where that could shake things loose, but I am surprised that
> > > > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > with no trouble, but I will be running a longer test later on.
> > > > 
> > > >> So I wonder if this is a NO_HZ problem.
> > > > 
> > > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > > > you running?  (Again, my symptoms are slightly different, so I might
> > > > be seeing a different bug.)
> > > 
> > > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > > 
> > > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.
> > 
> > Same here -- but my failure case happens fairly rarely, so it will take
> > some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
> > had effect.
> > 
> > But you are right, might be interesting to try NO_HZ_PERIODIC=y
> > or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> > 
> > Thanx, Paul
> >   
> I'll be the headless chicken running around and trying as many tests
> as I can fit in.  Typical time to see the failure for us is sub 10
> minutes so we'll see how far we get.
> 
> Make me a list to run if you like ;)
> 
> NO_HZ_PERIODIC=y running now.
By which I mean CONFIG_HZ_PERIODIC=y

Anyhow, run for 40 minutes without seeing a splat, but my sanity check
on the NO_HZ_FULL=n and NO_HZ_IDLE=y config this morning took 20 minutes so
I won't have much confidence until we are a few hours in on this.

Anyhow, certainly looking like a promising direction for investigation!

Jonathan

> 
> Jonathan
> 




Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Jonathan Cameron
On Tue, 25 Jul 2017 21:12:17 -0700
"Paul E. McKenney"  wrote:

> On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:
> > From: "Paul E. McKenney" 
> > Date: Tue, 25 Jul 2017 20:55:45 -0700
> >   
> > > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:  
> > >> Just to report, turning softlockup back on fixes things for me on
> > >> sparc64 too.  
> > > 
> > > Very good!
> > >   
> > >> The thing about softlockup is it runs an hrtimer, which seems to run
> > >> about every 4 seconds.  
> > > 
> > > I could see where that could shake things loose, but I am surprised that
> > > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > > with no trouble, but I will be running a longer test later on.
> > >   
> > >> So I wonder if this is a NO_HZ problem.  
> > > 
> > > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > > you running?  (Again, my symptoms are slightly different, so I might
> > > be seeing a different bug.)  
> > 
> > I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> > 
> > To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.  
> 
> Same here -- but my failure case happens fairly rarely, so it will take
> some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
> had effect.
> 
> But you are right, might be interesting to try NO_HZ_PERIODIC=y
> or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)
> 
>   Thanx, Paul
> 
I'll be the headless chicken running around and trying as many tests
as I can fit in.  Typical time to see the failure for us is sub 10
minutes so we'll see how far we get.

Make me a list to run if you like ;)

NO_HZ_PERIODIC=y running now.

Jonathan



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-26 Thread Jonathan Cameron
On Tue, 25 Jul 2017 20:53:06 -0700
"Paul E. McKenney"  wrote:

> On Wed, Jul 26, 2017 at 12:52:07AM +0800, Jonathan Cameron wrote:
> > On Tue, 25 Jul 2017 08:12:45 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:  
> > > > On Tue, 25 Jul 2017 06:46:26 -0700
> > > > "Paul E. McKenney"  wrote:
> > > > 
> > > > > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:
> > > > > > On Tue, 25 Jul 2017 19:32:10 +0800
> > > > > > Jonathan Cameron  wrote:
> > > > > >   
> > > > > > > Hi All,
> > > > > > > 
> > > > > > > We observed a regression on our d05 boards (but curiously not
> > > > > > > the fairly similar but single socket / smaller core count
> > > > > > > d03), initially seen with linux-next prior to the merge window
> > > > > > > and still present in v4.13-rc2.
> > > > > > > 
> > > > > > > The symptom is:  
> > > > > 
> > > > > Adding Dave Miller and the sparcli...@vger.kernel.org email on CC, as
> > > > > they have been seeing something similar, and you might well have saved
> > > > > them the trouble of bisecting.
> > > > > 
> > > > > [ . . . ]
> > > > > 
> > > > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! 
> > > > > > > g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1  
> > > > > 
> > > > > This is the cause from an RCU perspective.  You had a lot of idle 
> > > > > CPUs,
> > > > > and RCU is not permitted to disturb them -- the battery-powered 
> > > > > embedded
> > > > > guys get very annoyed by that sort of thing.  What happens instead is
> > > > > that each CPU updates a per-CPU state variable when entering or 
> > > > > exiting
> > > > > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> > > > > message) checks these state variables, and when it sees an idle CPU,
> > > > > it reports a quiescent state on that CPU's behalf.
> > > > > 
> > > > > But the grace-period kthread can only do this work if it gets a chance
> > > > > to run.  And the message above says that this kthread hasn't had a 
> > > > > chance
> > > > > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> > > > > says that grace period #1566 is in progress, the "f0x0" says that no 
> > > > > one
> > > > > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> > > > > that the grace-period kthread has fully initialized the current grace
> > > > > period and is sleeping for a few jiffies waiting to scan for idle 
> > > > > tasks.
> > > > > Finally, the "->state=0x1" says that the grace-period kthread is in
> > > > > TASK_INTERRUPTIBLE state, in other words, still sleeping.
> > > > 
> > > > Thanks for the explanation!
> > > > > 
> > > > > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> > > > > split up config options) do to prevent the grace-period kthread from
> > > > > getting a chance to run?" 
> > > > 
> > > > As far as we can tell it was a side effect of that patch.
> > > > 
> > > > The real cause is that the patch changed the resulting defconfigs to stop
> > > > running the softlockup detector - now CONFIG_SOFTLOCKUP_DETECTOR
> > > > 
> > > > Enabling that on 4.13-rc2 (and presumably everything in between)
> > > > means we don't see the problem any more.
> > > > 
> > > > > I must confess that I don't see anything
> > > > > obvious in that commit, so my second question is "Are we sure that
> > > > > reverting this commit makes the problem go away?"
> > > > 
> > > > Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
> > > > That detector fires up a thread on every cpu, which may be relevant.
> > > 
> > > Interesting...  Why should it be necessary to fire up a thread on every
> > > CPU in order to make sure that RCU's grace-period kthreads get some
> > > CPU time?  Especially given how many idle CPUs you had on your system.
> > > 
> > > So I have to ask if there is some other bug that the softlockup detector
> > > is masking.  
> > I am thinking the same.  We can try going back further than 4.12 tomorrow
> > (we think we can realistically go back to 4.8 and possibly 4.6
> > with this board)  
> 
> Looking forward to seeing results!
> 
> > > > > and my third is "Is
> > > > > this an intermittent problem that led to a false bisection?"
> > > > 
> > > > Whilst it is a bit slow to occur, we verified with long runs on either
> > > > side of that patch and since with the option enabled on latest mainline.
> > > > 
> > > > We can also cause the issue before that patch by disabling the previous
> > > > relevant option on 4.12.
> > > 
> > > OK, thank you -- hard to argue with that!  ;-)  
> > 
> > We thought it was a pretty unlikely bisection result,
> > hence hammered it thoroughly ;)  
> 
> Glad that I am not the only paranoid one out here.  ;-)
> 
> > > Except that I am still 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-25 Thread Paul E. McKenney
On Tue, Jul 25, 2017 at 09:02:33PM -0700, David Miller wrote:
> From: "Paul E. McKenney" 
> Date: Tue, 25 Jul 2017 20:55:45 -0700
> 
> > On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:
> >> Just to report, turning softlockup back on fixes things for me on
> >> sparc64 too.
> > 
> > Very good!
> > 
> >> The thing about softlockup is it runs an hrtimer, which seems to run
> >> about every 4 seconds.
> > 
> > I could see where that could shake things loose, but I am surprised that
> > it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> > with no trouble, but I will be running a longer test later on.
> > 
> >> So I wonder if this is a NO_HZ problem.
> > 
> > Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> > you running?  (Again, my symptoms are slightly different, so I might
> > be seeing a different bug.)
> 
> I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.
> 
> To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.

Same here -- but my failure case happens fairly rarely, so it will take
some time to gain reasonable confidence that enabling SOFTLOCKUP_DETECTOR
had effect.

But you are right, might be interesting to try NO_HZ_PERIODIC=y
or NO_HZ_FULL=y.  So many possible tests, and so little time.  ;-)

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-25 Thread David Miller
From: "Paul E. McKenney" 
Date: Tue, 25 Jul 2017 20:55:45 -0700

> On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:
>> Just to report, turning softlockup back on fixes things for me on
>> sparc64 too.
> 
> Very good!
> 
>> The thing about softlockup is it runs an hrtimer, which seems to run
>> about every 4 seconds.
> 
> I could see where that could shake things loose, but I am surprised that
> it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
> with no trouble, but I will be running a longer test later on.
> 
>> So I wonder if this is a NO_HZ problem.
> 
> Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
> you running?  (Again, my symptoms are slightly different, so I might
> be seeing a different bug.)

I run with NO_HZ_FULL=n and NO_HZ_IDLE=y, just like you.

To clarify, the symptoms show up with SOFTLOCKUP_DETECTOR disabled.
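
For concreteness, the failing combination being reported corresponds
to a .config fragment along these lines (illustrative; the option
spelling follows the post-05a4a9527 split):

CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
# CONFIG_HZ_PERIODIC is not set
# CONFIG_SOFTLOCKUP_DETECTOR is not set

Flipping the last of these to CONFIG_SOFTLOCKUP_DETECTOR=y is what
makes the symptoms disappear for the reporters in this thread.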


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-25 Thread Paul E. McKenney
On Tue, Jul 25, 2017 at 02:10:29PM -0700, David Miller wrote:
> From: Jonathan Cameron 
> Date: Wed, 26 Jul 2017 00:52:07 +0800
> 
> > On Tue, 25 Jul 2017 08:12:45 -0700
> > "Paul E. McKenney"  wrote:
> > 
> >> On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:
> >> > On Tue, 25 Jul 2017 06:46:26 -0700
> >> > "Paul E. McKenney"  wrote:
> >> >   
> >> > > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:  
> >> > > > On Tue, 25 Jul 2017 19:32:10 +0800
> >> > > > Jonathan Cameron  wrote:
> >> > > > 
> >> > > > > Hi All,
> >> > > > > 
> >> > > > > We observed a regression on our d05 boards (but curiously not
> >> > > > > the fairly similar but single socket / smaller core count
> >> > > > > d03), initially seen with linux-next prior to the merge window
> >> > > > > and still present in v4.13-rc2.
> >> > > > > 
> >> > > > > The symptom is:
> >> > > 
> >> > > Adding Dave Miller and the sparcli...@vger.kernel.org email on CC, as
> >> > > they have been seeing something similar, and you might well have saved
> >> > > them the trouble of bisecting.
> >> > > 
> >> > > [ . . . ]
> >> > >   
> >> > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 
> >> > > > > c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> >> > > 
> >> > > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> >> > > and RCU is not permitted to disturb them -- the battery-powered 
> >> > > embedded
> >> > > guys get very annoyed by that sort of thing.  What happens instead is
> >> > > that each CPU updates a per-CPU state variable when entering or exiting
> >> > > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> >> > > message) checks these state variables, and when it sees an idle CPU,
> >> > > it reports a quiescent state on that CPU's behalf.
> >> > > 
> >> > > But the grace-period kthread can only do this work if it gets a chance
> >> > > to run.  And the message above says that this kthread hasn't had a 
> >> > > chance
> >> > > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> >> > > says that grace period #1566 is in progress, the "f0x0" says that no 
> >> > > one
> >> > > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> >> > > that the grace-period kthread has fully initialized the current grace
> >> > > period and is sleeping for a few jiffies waiting to scan for idle 
> >> > > tasks.
> >> > > Finally, the "->state=0x1" says that the grace-period kthread is in
> >> > > TASK_INTERRUPTIBLE state, in other words, still sleeping.  
> >> > 
> >> > Thanks for the explanation!  
> >> > > 
> >> > > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> >> > > split up config options) do to prevent the grace-period kthread from
> >> > > getting a chance to run?"   
> >> > 
> >> > As far as we can tell it was a side effect of that patch.
> >> > 
> >> > The real cause is that the patch changed the resulting defconfigs to stop
> >> > running the softlockup detector - now CONFIG_SOFTLOCKUP_DETECTOR
> >> > 
> >> > Enabling that on 4.13-rc2 (and presumably everything in between)
> >> > means we don't see the problem any more.
> >> >   
> >> > > I must confess that I don't see anything
> >> > > obvious in that commit, so my second question is "Are we sure that
> >> > > reverting this commit makes the problem go away?"  
> >> > 
> >> > Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
> >> > That detector fires up a thread on every cpu, which may be relevant.  
> >> 
> >> Interesting...  Why should it be necessary to fire up a thread on every
> >> CPU in order to make sure that RCU's grace-period kthreads get some
> >> CPU time?  Especially given how many idle CPUs you had on your system.
> >> 
> >> So I have to ask if there is some other bug that the softlockup detector
> >> is masking.
> > I am thinking the same.  We can try going back further than 4.12 tomorrow
> > (we think we can realistically go back to 4.8 and possibly 4.6
> > with this board)
> 
> Just to report, turning softlockup back on fixes things for me on
> sparc64 too.

Very good!

> The thing about softlockup is it runs an hrtimer, which seems to run
> about every 4 seconds.

I could see where that could shake things loose, but I am surprised that
it would be needed.  I ran a short run with CONFIG_SOFTLOCKUP_DETECTOR=y
with no trouble, but I will be running a longer test later on.

> So I wonder if this is a NO_HZ problem.

Might be.  My tests run with NO_HZ_FULL=n and NO_HZ_IDLE=y.  What are
you running?  (Again, my symptoms are slightly different, so I might
be seeing a different bug.)

Thanx, Paul



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-25 Thread Paul E. McKenney
On Wed, Jul 26, 2017 at 12:52:07AM +0800, Jonathan Cameron wrote:
> On Tue, 25 Jul 2017 08:12:45 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:
> > > On Tue, 25 Jul 2017 06:46:26 -0700
> > > "Paul E. McKenney"  wrote:
> > >   
> > > > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:  
> > > > > On Tue, 25 Jul 2017 19:32:10 +0800
> > > > > Jonathan Cameron  wrote:
> > > > > 
> > > > > > Hi All,
> > > > > > 
> > > > > > We observed a regression on our d05 boards (but curiously not
> > > > > > the fairly similar but single socket / smaller core count
> > > > > > d03), initially seen with linux-next prior to the merge window
> > > > > > and still present in v4.13-rc2.
> > > > > > 
> > > > > > The symptom is:
> > > > 
> > > > Adding Dave Miller and the sparcli...@vger.kernel.org email on CC, as
> > > > they have been seeing something similar, and you might well have saved
> > > > them the trouble of bisecting.
> > > > 
> > > > [ . . . ]
> > > >   
> > > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 
> > > > > > c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > > 
> > > > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> > > > and RCU is not permitted to disturb them -- the battery-powered embedded
> > > > guys get very annoyed by that sort of thing.  What happens instead is
> > > > that each CPU updates a per-CPU state variable when entering or exiting
> > > > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> > > > message) checks these state variables, and when it sees an idle CPU,
> > > > it reports a quiescent state on that CPU's behalf.
> > > > 
> > > > But the grace-period kthread can only do this work if it gets a chance
> > > > to run.  And the message above says that this kthread hasn't had a 
> > > > chance
> > > > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> > > > says that grace period #1566 is in progress, the "f0x0" says that no one
> > > > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> > > > that the grace-period kthread has fully initialized the current grace
> > > > period and is sleeping for a few jiffies waiting to scan for idle tasks.
> > > > Finally, the "->state=0x1" says that the grace-period kthread is in
> > > > TASK_INTERRUPTIBLE state, in other words, still sleeping.  
> > > 
> > > Thanks for the explanation!  
> > > > 
> > > > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> > > > split up config options) do to prevent the grace-period kthread from
> > > > getting a chance to run?"   
> > > 
> > > As far as we can tell it was a side effect of that patch.
> > > 
> > > The real cause is that the patch changed the resulting defconfigs to stop
> > > running the softlockup detector - now CONFIG_SOFTLOCKUP_DETECTOR
> > > 
> > > Enabling that on 4.13-rc2 (and presumably everything in between)
> > > means we don't see the problem any more.
> > >   
> > > > I must confess that I don't see anything
> > > > obvious in that commit, so my second question is "Are we sure that
> > > > reverting this commit makes the problem go away?"  
> > > 
> > > Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
> > > That detector fires up a thread on every cpu, which may be relevant.  
> > 
> > Interesting...  Why should it be necessary to fire up a thread on every
> > CPU in order to make sure that RCU's grace-period kthreads get some
> > CPU time?  Especially given how many idle CPUs you had on your system.
> > 
> > So I have to ask if there is some other bug that the softlockup detector
> > is masking.
> I am thinking the same.  We can try going back further than 4.12 tomorrow
> (we think we can realistically go back to 4.8 and possibly 4.6
> with this board)

Looking forward to seeing results!

> > > > and my third is "Is
> > > > this an intermittent problem that led to a false bisection?"  
> > > 
> > > Whilst it is a bit slow to occur, we verified with long runs on either
> > > side of that patch and since with the option enabled on latest mainline.
> > > 
> > > We can also cause the issue before that patch by disabling the previous
> > > relevant option on 4.12.  
> > 
> > OK, thank you -- hard to argue with that!  ;-)
> 
> We thought it was a pretty unlikely bisection result,
> hence hammered it thoroughly ;)

Glad that I am not the only paranoid one out here.  ;-)

> > Except that I am still puzzled as to why per-CPU softlockup threads
> > are needed for RCU's kthreads to get their wakeups.  We really should
> > be able to disable softlockup and still have kthreads get wakeups and
> > access to CPU, after all.
> > 
> > > > [ . . . ]
> > > >   
> > > > > > Reducing the RCU CPU stall timeout makes it happen more often,
> > > > > > but we are seeing even with the default 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-25 Thread Jonathan Cameron
On Tue, 25 Jul 2017 08:12:45 -0700
"Paul E. McKenney"  wrote:

> On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:
> > On Tue, 25 Jul 2017 06:46:26 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:  
> > > > On Tue, 25 Jul 2017 19:32:10 +0800
> > > > Jonathan Cameron  wrote:
> > > > 
> > > > > Hi All,
> > > > > 
> > > > > We observed a regression on our d05 boards (but curiously not
> > > > > the fairly similar but single socket / smaller core count
> > > > > d03), initially seen with linux-next prior to the merge window
> > > > > and still present in v4.13-rc2.
> > > > > 
> > > > > The symptom is:
> > > 
> > > Adding Dave Miller and the sparcli...@vger.kernel.org email on CC, as
> > > they have been seeing something similar, and you might well have saved
> > > them the trouble of bisecting.
> > > 
> > > [ . . . ]
> > >   
> > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 
> > > > > c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > > 
> > > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> > > and RCU is not permitted to disturb them -- the battery-powered embedded
> > > guys get very annoyed by that sort of thing.  What happens instead is
> > > that each CPU updates a per-CPU state variable when entering or exiting
> > > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> > > message) checks these state variables, and when it sees an idle CPU,
> > > it reports a quiescent state on that CPU's behalf.
> > > 
> > > But the grace-period kthread can only do this work if it gets a chance
> > > to run.  And the message above says that this kthread hasn't had a chance
> > > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> > > says that grace period #1566 is in progress, the "f0x0" says that no one
> > > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> > > that the grace-period kthread has fully initialized the current grace
> > > period and is sleeping for a few jiffies waiting to scan for idle tasks.
> > > Finally, the "->state=0x1" says that the grace-period kthread is in
> > > TASK_INTERRUPTIBLE state, in other words, still sleeping.  
> > 
> > Thanks for the explanation!  
> > > 
> > > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> > > split up config options) do to prevent the grace-period kthread from
> > > getting a chance to run?"   
> > 
> > As far as we can tell it was a side effect of that patch.
> > 
> > The real cause is that the patch changed the resulting defconfigs to stop
> > running the softlockup detector - now CONFIG_SOFTLOCKUP_DETECTOR
> > 
> > Enabling that on 4.13-rc2 (and presumably everything in between)
> > means we don't see the problem any more.
> >   
> > > I must confess that I don't see anything
> > > obvious in that commit, so my second question is "Are we sure that
> > > reverting this commit makes the problem go away?"  
> > 
> > Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
> > That detector fires up a thread on every cpu, which may be relevant.  
> 
> Interesting...  Why should it be necessary to fire up a thread on every
> CPU in order to make sure that RCU's grace-period kthreads get some
> CPU time?  Especially given how many idle CPUs you had on your system.
> 
> So I have to ask if there is some other bug that the softlockup detector
> is masking.
I am thinking the same.  We can try going back further than 4.12 tomorrow
(we think we can realistically go back to 4.8 and possibly 4.6
with this board)
> 
> > > and my third is "Is
> > > this an intermittent problem that led to a false bisection?"  
> > 
> > Whilst it is a bit slow to occur, we verified with long runs on either
> > side of that patch and since with the option enabled on latest mainline.
> > 
> > We can also cause the issue before that patch by disabling the previous
> > relevant option on 4.12.  
> 
> OK, thank you -- hard to argue with that!  ;-)
We thought it was a pretty unlikely bisection result,
hence hammered it thoroughly ;)
> 
> Except that I am still puzzled as to why per-CPU softlockup threads
> are needed for RCU's kthreads to get their wakeups.  We really should
> be able to disable softlockup and still have kthreads get wakeups and
> access to CPU, after all.
> 
> > > [ . . . ]
> > >   
> > > > > Reducing the RCU CPU stall timeout makes it happen more often,
> > > > > but we are seeing it even with the default value of 24 seconds.
> > > > > 
> > > > > Tends to occur after a period of relatively low usage, but has
> > > > > also been seen midway through performance tests.
> > > > > 
> > > > > This was not seen with v4.12 so a bisection run later lead to
> > > > > commit 05a4a9527 (kernel/watchdog: split up config options).
> > > > > 
> > > > > Which was odd until we 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-25 Thread Jonathan Cameron
On Tue, 25 Jul 2017 06:46:26 -0700
"Paul E. McKenney"  wrote:

> On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:
> > On Tue, 25 Jul 2017 19:32:10 +0800
> > Jonathan Cameron  wrote:
> >   
> > > Hi All,
> > > 
> > > We observed a regression on our d05 boards (but curiously not
> > > the fairly similar but single socket / smaller core count
> > > d03), initially seen with linux-next prior to the merge window
> > > and still present in v4.13-rc2.
> > > 
> > > The symptom is:  
> 
> Adding Dave Miller and the sparcli...@vger.kernel.org email on CC, as
> they have been seeing something similar, and you might well have saved
> them the trouble of bisecting.
> 
> [ . . . ]
> 
> > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 
> > > f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1  
> 
> This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> and RCU is not permitted to disturb them -- the battery-powered embedded
> guys get very annoyed by that sort of thing.  What happens instead is
> that each CPU updates a per-CPU state variable when entering or exiting
> idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> message) checks these state variables, and when it sees an idle CPU,
> it reports a quiescent state on that CPU's behalf.
> 
> But the grace-period kthread can only do this work if it gets a chance
> to run.  And the message above says that this kthread hasn't had a chance
> to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> says that grace period #1566 is in progress, the "f0x0" says that no one
> is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> that the grace-period kthread has fully initialized the current grace
> period and is sleeping for a few jiffies waiting to scan for idle tasks.
> Finally, the "->state=0x1" says that the grace-period kthread is in
> TASK_INTERRUPTIBLE state, in other words, still sleeping.
Thanks for the explanation!
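
For anyone decoding such warnings later, the fields in the stall line
quoted above map out roughly as in this compilable sketch (the struct
and field names are descriptive stand-ins, not the kernel's internals):

#include <stdio.h>

struct gp_stall_report {
	unsigned long starved_jiffies;	/* 5663: how long the kthread has not run */
	long gp_in_progress;		/* g1566: grace period now in progress */
	long gp_completed;		/* c1565: last completed grace period */
	unsigned int flags;		/* f0x0: no one needs grace period #1567 */
	int gp_state;			/* 3: RCU_GP_WAIT_FQS, timed wait before
					   scanning for idle CPUs */
	long task_state;		/* 0x1: TASK_INTERRUPTIBLE, still asleep */
};

int main(void)
{
	struct gp_stall_report r = { 5663, 1566, 1565, 0x0, 3, 0x1 };

	printf("rcu_preempt kthread starved for %lu jiffies! g%ld c%ld f0x%x "
	       "RCU_GP_WAIT_FQS(%d) ->state=0x%lx\n",
	       r.starved_jiffies, r.gp_in_progress, r.gp_completed,
	       r.flags, r.gp_state, r.task_state);
	return 0;
}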
> 
> So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> split up config options) do to prevent the grace-period kthread from
> getting a chance to run?" 

As far as we can tell it was a side effect of that patch.

The real cause is that the patch changed the resulting defconfigs to stop running
the softlockup detector - now CONFIG_SOFTLOCKUP_DETECTOR

Enabling that on 4.13-rc2 (and presumably everything in between)
means we don't see the problem any more.

> I must confess that I don't see anything
> obvious in that commit, so my second question is "Are we sure that
> reverting this commit makes the problem go away?"
Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
That detector fires up a thread on every cpu, which may be relevant.

> and my third is "Is
> this an intermittent problem that led to a false bisection?"
Whilst it is a bit slow to occur, we verified with long runs on either
side of that patch and since with the option enabled on latest mainline.

We can also cause the issue before that patch by disabling the previous
relevant option on 4.12.

> 
> [ . . . ]
> 
> > > Reducing the RCU CPU stall timeout makes it happen more often,
> > > but we are seeing it even with the default value of 24 seconds.
> > > 
> > > Tends to occur after a period of relatively low usage, but has
> > > also been seen midway through performance tests.
> > > 
> > > This was not seen with v4.12 so a bisection run later lead to
> > > commit 05a4a9527 (kernel/watchdog: split up config options).
> > > 
> > > Which was odd until we discovered that a side effect of this patch
> > > was to change whether the softlockup detector was enabled or not in
> > > the arm64 defconfig.
> > > 
> > > On 4.13-rc2 enabling the softlockup detector indeed stopped us
> > > seeing the rcu issue. Disabling the equivalent on 4.12 made the
> > > issue occur there as well.
> > > 
> > > Clearly the softlockup detector results in a thread on every cpu,
> > > which might be related but beyond that we are still looking into
> > > the issue.
> > > 
> > > So the obvious question is whether anyone else is seeing this as
> > > it might help us to focus in on where to look!  
> > 
> > Huh. Something similar has been seen very intermittently on powerpc
> > as well. We couldn't reproduce it reliably to bisect it already, so
> > this is a good help.
> > 
> > http://marc.info/?l=linuxppc-embedded&m=149872815523646&w=2
> > 
> > It looks like the watchdog patch has a similar effect on powerpc in
> > that it stops enabling the softlockup detector by default. Haven't
> > confirmed, but it looks like the same thing.
> > 
> > A bug in RCU stall detection?  
> 
> Well, if I am expected to make grace periods complete when my grace-period
> kthreads aren't getting any CPU time, I will have to make some substantial
> changes.  ;-)
> 
> One possibility is that the timer isn't firing and another is that the
> timer's wakeup is being lost somehow.

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-25 Thread David Miller
From: Jonathan Cameron 
Date: Wed, 26 Jul 2017 00:52:07 +0800

> On Tue, 25 Jul 2017 08:12:45 -0700
> "Paul E. McKenney"  wrote:
> 
>> On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:
>> > On Tue, 25 Jul 2017 06:46:26 -0700
>> > "Paul E. McKenney"  wrote:
>> >   
>> > > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:  
>> > > > On Tue, 25 Jul 2017 19:32:10 +0800
>> > > > Jonathan Cameron  wrote:
>> > > > 
>> > > > > Hi All,
>> > > > > 
>> > > > > We observed a regression on our d05 boards (but curiously not
>> > > > > the fairly similar but single socket / smaller core count
>> > > > > d03), initially seen with linux-next prior to the merge window
>> > > > > and still present in v4.13-rc2.
>> > > > > 
>> > > > > The symptom is:
>> > > 
>> > > Adding Dave Miller and the sparcli...@vger.kernel.org email on CC, as
>> > > they have been seeing something similar, and you might well have saved
>> > > them the trouble of bisecting.
>> > > 
>> > > [ . . . ]
>> > >   
>> > > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
>> > > 
>> > > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
>> > > and RCU is not permitted to disturb them -- the battery-powered embedded
>> > > guys get very annoyed by that sort of thing.  What happens instead is
>> > > that each CPU updates a per-CPU state variable when entering or exiting
>> > > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
>> > > message) checks these state variables, and when it sees an idle CPU,
>> > > it reports a quiescent state on that CPU's behalf.
>> > > 
>> > > But the grace-period kthread can only do this work if it gets a chance
>> > > to run.  And the message above says that this kthread hasn't had a chance
>> > > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
>> > > says that grace period #1566 is in progress, the "f0x0" says that no one
>> > > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
>> > > that the grace-period kthread has fully initialized the current grace
>> > > period and is sleeping for a few jiffies waiting to scan for idle tasks.
>> > > Finally, the "->state=0x1" says that the grace-period kthread is in
>> > > TASK_INTERRUPTIBLE state, in other words, still sleeping.  
>> > 
>> > Thanks for the explanation!  
>> > > 
>> > > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
>> > > split up config options) do to prevent the grace-period kthread from
>> > > getting a chance to run?"   
>> > 
>> > As far as we can tell it was a side effect of that patch.
>> > 
>> > The real cause is that the patch changed the result of the defconfigs so
>> > that the softlockup detector no longer runs - it is now controlled by
>> > CONFIG_SOFTLOCKUP_DETECTOR.
>> > 
>> > Enabling that on 4.13-rc2 (and presumably everything in between)
>> > means we don't see the problem any more.
>> >   
>> > > I must confess that I don't see anything
>> > > obvious in that commit, so my second question is "Are we sure that
>> > > reverting this commit makes the problem go away?"  
>> > 
>> > Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
>> > That detector fires up a thread on every cpu, which may be relevant.  
>> 
>> Interesting...  Why should it be necessary to fire up a thread on every
>> CPU in order to make sure that RCU's grace-period kthreads get some
>> CPU time?  Especially given how many idle CPUs you had on your system.
>> 
>> So I have to ask if there is some other bug that the softlockup detector
>> is masking.
> I am thinking the same.  We can try going back further than 4.12 tomorrow
> (we think we can realistically go back to 4.8 and possibly 4.6
> with this board)

Just to report, turning softlockup back on fixes things for me on
sparc64 too.

The thing about softlockup is that it runs an hrtimer, which seems to fire
about every 4 seconds.

So I wonder if this is a NO_HZ problem.
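
For concreteness, that watchdog timer is the usual self-rearming hrtimer
pattern, roughly like this kernel-style sketch (simplified; the 4-second
period and the empty callback body are placeholders, not the real
watchdog code):

/* Simplified self-rearming hrtimer; names here are invented for the
 * example, and the callback stands in for the real watchdog work. */
#include <linux/hrtimer.h>
#include <linux/ktime.h>

static struct hrtimer watchdog_sketch_timer;

static enum hrtimer_restart watchdog_sketch_fn(struct hrtimer *t)
{
	/* ... touch per-CPU watchdog state, wake the watchdog thread ... */
	hrtimer_forward_now(t, ms_to_ktime(4000));	/* re-arm for +4s */
	return HRTIMER_RESTART;
}

static void watchdog_sketch_start(void)
{
	hrtimer_init(&watchdog_sketch_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
	watchdog_sketch_timer.function = watchdog_sketch_fn;
	hrtimer_start(&watchdog_sketch_timer, ms_to_ktime(4000), HRTIMER_MODE_REL);
}

An hrtimer like this is driven by the per-CPU tick hardware rather than
the timer wheel, which may be why having it around papers over a lost
timer-wheel wakeup.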


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-25 Thread Paul E. McKenney
On Tue, Jul 25, 2017 at 10:42:45PM +0800, Jonathan Cameron wrote:
> On Tue, 25 Jul 2017 06:46:26 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:
> > > On Tue, 25 Jul 2017 19:32:10 +0800
> > > Jonathan Cameron  wrote:
> > >   
> > > > Hi All,
> > > > 
> > > > We observed a regression on our d05 boards (but curiously not
> > > > the fairly similar but single socket / smaller core count
> > > > d03), initially seen with linux-next prior to the merge window
> > > > and still present in v4.13-rc2.
> > > > 
> > > > The symptom is:  
> > 
> > Adding Dave Miller and the sparcli...@vger.kernel.org email on CC, as
> > they have been seeing something similar, and you might well have saved
> > them the trouble of bisecting.
> > 
> > [ . . . ]
> > 
> > > > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
> > 
> > This is the cause from an RCU perspective.  You had a lot of idle CPUs,
> > and RCU is not permitted to disturb them -- the battery-powered embedded
> > guys get very annoyed by that sort of thing.  What happens instead is
> > that each CPU updates a per-CPU state variable when entering or exiting
> > idle, and the grace-period kthread ("rcu_preempt kthread" in the above
> > message) checks these state variables, and when it sees an idle CPU,
> > it reports a quiescent state on that CPU's behalf.
> > 
> > But the grace-period kthread can only do this work if it gets a chance
> > to run.  And the message above says that this kthread hasn't had a chance
> > to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
> > says that grace period #1566 is in progress, the "f0x0" says that no one
> > is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
> > that the grace-period kthread has fully initialized the current grace
> > period and is sleeping for a few jiffies waiting to scan for idle tasks.
> > Finally, the "->state=0x1" says that the grace-period kthread is in
> > TASK_INTERRUPTIBLE state, in other words, still sleeping.
> 
> Thanks for the explanation!
> > 
> > So my first question is "What did commit 05a4a9527 (kernel/watchdog:
> > split up config options) do to prevent the grace-period kthread from
> > getting a chance to run?" 
> 
> As far as we can tell it was a side effect of that patch.
> 
> The real cause is that the patch changed the result of the defconfigs so that
> the softlockup detector no longer runs - it is now controlled by
> CONFIG_SOFTLOCKUP_DETECTOR.
> 
> Enabling that on 4.13-rc2 (and presumably everything in between)
> means we don't see the problem any more.
> 
> > I must confess that I don't see anything
> > obvious in that commit, so my second question is "Are we sure that
> > reverting this commit makes the problem go away?"
> 
> Simply enabling CONFIG_SOFTLOCKUP_DETECTOR seems to make it go away.
> That detector fires up a thread on every cpu, which may be relevant.

Interesting...  Why should it be necessary to fire up a thread on every
CPU in order to make sure that RCU's grace-period kthreads get some
CPU time?  Especially given how many idle CPUs you had on your system.

So I have to ask if there is some other bug that the softlockup detector
is masking.

> > and my third is "Is
> > this an intermittent problem that led to a false bisection?"
> 
> Whilst it is a bit slow to occur, we verified it with long runs on either
> side of that patch, and since then with the option enabled on latest mainline.
> 
> We can also cause the issue before that patch by disabling the previously
> relevant option on 4.12.

OK, thank you -- hard to argue with that!  ;-)

Except that I am still puzzled as to why per-CPU softlockup threads
are needed for RCU's kthreads to get their wakeups.  We really should
be able to disable softlockup and still have kthreads get wakeups and
access to CPU, after all.
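
To make that concrete, the grace-period kthread's wait boils down to
something like the following sketch (simplified, with invented names):
the only thing it needs from the rest of the system is a timer expiry
or an explicit wake_up(), not per-CPU helper threads.

/* Simplified sketch of a grace-period kthread's wait loop; this is
 * not RCU's actual code, just the shape of the dependency. */
#include <linux/compiler.h>
#include <linux/kthread.h>
#include <linux/wait.h>

static DECLARE_WAIT_QUEUE_HEAD(gp_wq);
static unsigned long gp_flags;		/* e.g. bit 0 = "scan needed now" */

static int gp_sketch_kthread(void *unused)
{
	while (!kthread_should_stop()) {
		/* Sleep until kicked or until a few jiffies elapse. */
		wait_event_interruptible_timeout(gp_wq,
						 READ_ONCE(gp_flags) & 0x1,
						 3 /* jiffies */);
		/* ... scan the per-CPU idle state, report quiescent
		   states, and advance the grace period ... */
	}
	return 0;
}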

> > [ . . . ]
> > 
> > > > Reducing the RCU CPU stall timeout makes it happen more often,
> > > > but we are seeing it even with the default value of 24 seconds.
> > > > 
> > > > It tends to occur after a period of relatively low usage, but has
> > > > also been seen midway through performance tests.
> > > > 
> > > > This was not seen with v4.12, so a bisection run later led to
> > > > commit 05a4a9527 (kernel/watchdog: split up config options).
> > > > 
> > > > Which was odd until we discovered that a side effect of this patch
> > > > was to change whether the softlockup detector was enabled or not in
> > > > the arm64 defconfig.
> > > > 
> > > > On 4.13-rc2 enabling the softlockup detector indeed stopped us
> > > > seeing the rcu issue. Disabling the equivalent on 4.12 made the
> > > > issue occur there as well.
> > > > 
> > > > Clearly the softlockup detector results in a thread on every cpu,
> > > > which might be related but beyond that we are still looking into
> > > > the issue.
> > > > 
> > > > So the obvious question is whether anyone else is seeing this as
> > > > it might help us to focus in on where to look!

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-25 Thread Paul E. McKenney
On Tue, Jul 25, 2017 at 10:26:54PM +1000, Nicholas Piggin wrote:
> On Tue, 25 Jul 2017 19:32:10 +0800
> Jonathan Cameron  wrote:
> 
> > Hi All,
> > 
> > We observed a regression on our d05 boards (but curiously not
> > the fairly similar but single socket / smaller core count
> > d03), initially seen with linux-next prior to the merge window
> > and still present in v4.13-rc2.
> > 
> > The symptom is:

Adding Dave Miller and the sparcli...@vger.kernel.org email on CC, as
they have been seeing something similar, and you might well have saved
them the trouble of bisecting.

[ . . . ]

> > [ 1984.628602] rcu_preempt kthread starved for 5663 jiffies! g1566 c1565 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1

This is the cause from an RCU perspective.  You had a lot of idle CPUs,
and RCU is not permitted to disturb them -- the battery-powered embedded
guys get very annoyed by that sort of thing.  What happens instead is
that each CPU updates a per-CPU state variable when entering or exiting
idle, and the grace-period kthread ("rcu_preempt kthread" in the above
message) checks these state variables, and when it sees an idle CPU,
it reports a quiescent state on that CPU's behalf.

But the grace-period kthread can only do this work if it gets a chance
to run.  And the message above says that this kthread hasn't had a chance
to run for a full 5,663 jiffies.  For completeness, the "g1566 c1565"
says that grace period #1566 is in progress, the "f0x0" says that no one
is needing another grace period #1567.  The "RCU_GP_WAIT_FQS(3)" says
that the grace-period kthread has fully initialized the current grace
period and is sleeping for a few jiffies waiting to scan for idle tasks.
Finally, the "->state=0x1" says that the grace-period kthread is in
TASK_INTERRUPTIBLE state, in other words, still sleeping.
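
In code form, the scheme is roughly the following.  This is a
deliberately simplified sketch with invented names, not the kernel's
actual implementation, but it shows the shape of the idea: the idle
path flips a per-CPU counter, and the grace-period kthread merely
samples it without ever disturbing the idle CPU.

/* Simplified sketch of the per-CPU idle-tracking scheme; all names
 * here are illustrative, not RCU's real identifiers. */
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 64

/* Incremented on every idle entry and exit; odd value => CPU is idle. */
static atomic_int dynticks[NR_CPUS];

void idle_enter(int cpu) { atomic_fetch_add(&dynticks[cpu], 1); /* -> odd  */ }
void idle_exit(int cpu)  { atomic_fetch_add(&dynticks[cpu], 1); /* -> even */ }

/* What the grace-period kthread does on each scan: no IPI, just a
 * read of the idle CPU's state variable. */
bool cpu_is_idle(int cpu)
{
	return atomic_load(&dynticks[cpu]) & 1;
}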

So my first question is "What did commit 05a4a9527 (kernel/watchdog:
split up config options) do to prevent the grace-period kthread from
getting a chance to run?"  I must confess that I don't see anything
obvious in that commit, so my second question is "Are we sure that
reverting this commit makes the problem go away?" and my third is "Is
this an intermittent problem that led to a false bisection?"

[ . . . ]

> > Reducing the RCU CPU stall timeout makes it happen more often,
> > but we are seeing it even with the default value of 24 seconds.
> > 
> > It tends to occur after a period of relatively low usage, but has
> > also been seen midway through performance tests.
> > 
> > This was not seen with v4.12, so a bisection run later led to
> > commit 05a4a9527 (kernel/watchdog: split up config options).
> > 
> > Which was odd until we discovered that a side effect of this patch
> > was to change whether the softlockup detector was enabled or not in
> > the arm64 defconfig.
> > 
> > On 4.13-rc2 enabling the softlockup detector indeed stopped us
> > seeing the rcu issue. Disabling the equivalent on 4.12 made the
> > issue occur there as well.
> > 
> > Clearly the softlockup detector results in a thread on every cpu,
> > which might be related but beyond that we are still looking into
> > the issue.
> > 
> > So the obvious question is whether anyone else is seeing this as
> > it might help us to focus in on where to look!
> 
> Huh. Something similar has been seen very intermittently on powerpc
> as well. We couldn't reproduce it reliably to bisect it already, so
> this is a good help.
> 
> http://marc.info/?l=linuxppc-embedded&m=149872815523646&w=2
> 
> It looks like the watchdog patch has a similar effect on powerpc in
> that it stops enabling the softlockup detector by default. Haven't
> confirmed, but it looks like the same thing.
> 
> A bug in RCU stall detection?

Well, if I am expected to make grace periods complete when my grace-period
kthreads aren't getting any CPU time, I will have to make some substantial
changes.  ;-)

One possibility is that the timer isn't firing and another is that the
timer's wakeup is being lost somehow.

So another thing to try is to boot with rcutree.rcu_kick_kthreads=1.
This will cause RCU to do redundant wakeups on the grace-period kthread
if the grace period is moving slowly.  This is of course a crude hack,
which is why this boot parameter will also cause a splat if it ever has
to do anything.
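
The effect of the option is roughly the following (an illustrative
sketch with invented identifiers, not RCU's actual force-quiescent-state
code):

/* Sketch of the rcu_kick_kthreads behavior described above. */
#include <linux/bug.h>
#include <linux/jiffies.h>
#include <linux/sched.h>

static bool rcu_kick_kthreads;		/* set by the boot parameter */

static void maybe_kick_gp_kthread(struct task_struct *gp_kthread,
				  unsigned long gp_activity,
				  unsigned long timeout)
{
	if (rcu_kick_kthreads && gp_kthread &&
	    time_after(jiffies, gp_activity + 2 * timeout)) {
		/* Splat so the redundant wakeup never goes unnoticed... */
		WARN_ONCE(1, "Kicking stalled grace-period kthread\n");
		/* ...then deliver the wakeup the timer should have. */
		wake_up_process(gp_kthread);
	}
}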

Does this help at all?

Thanx, Paul

> > In the meantime we'll carry on digging.
> > 
> > Thanks,
> > 
> > Jonathan
> > 
> > p.s. As a more general question, do we want to have the
> > soft lockup detector enabled on arm64 by default?
> 
> I've cc'ed Don. My patch should not have changed defconfigs, I
> should have been more careful with that.
> 
> Thanks,
> Nick
> 



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-25 Thread Nicholas Piggin
On Tue, 25 Jul 2017 19:32:10 +0800
Jonathan Cameron  wrote:

> Hi All,
> 
> We observed a regression on our d05 boards (but curiously not
> the fairly similar but single socket / smaller core count
> d03), initially seen with linux-next prior to the merge window
> and still present in v4.13-rc2.
> 
> The symptom is:
> 
> [ 1982.959365] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 1982.965021] 2-...: (10 GPs behind) idle=1d4/0/0 softirq=306/306 fqs=0
> [ 1982.971624] 3-...: (2 GPs behind) idle=700/0/0 softirq=307/307 fqs=0
> [ 1982.978139] 4-...: (20 GPs behind) idle=9f4/0/0 softirq=651/652 fqs=0
> [ 1982.984740] 5-...: (18 GPs behind) idle=a78/0/0 softirq=369/371 fqs=0
> [ 1982.991342] 6-...: (26 GPs behind) idle=e5c/0/0 softirq=217/219 fqs=0
> [ 1982.997944] 7-...: (1438 GPs behind) idle=eb4/0/0 softirq=260/260 fqs=0
> [ 1983.004719] 8-...: (18 GPs behind) idle=830/0/0 softirq=1609/1609 fqs=0
> [ 1983.011494] 9-...: (18 GPs behind) idle=e9c/0/0 softirq=242/242 fqs=0
> [ 1983.018095] 10-...: (1434 GPs behind) idle=ca0/0/0 softirq=210/212 fqs=0
> [ 1983.024957] 11-...: (1106 GPs behind) idle=ee0/0/0 softirq=188/191 fqs=0
> [ 1983.031819] 12-...: (1636 GPs behind) idle=c58/0/0 softirq=215/216 fqs=0
> [ 1983.038680] 13-...: (1114 GPs behind) idle=c20/0/0 softirq=170/170 fqs=0
> [ 1983.045542] 14-...: (1106 GPs behind) idle=d90/0/0 softirq=176/178 fqs=0
> [ 1983.052403] 15-...: (1858 GPs behind) idle=900/0/0 softirq=184/185 fqs=0
> [ 1983.059266] 16-...: (1621 GPs behind) idle=f04/0/0 softirq=204/206 fqs=0
> [ 1983.066127] 17-...: (1433 GPs behind) idle=d30/0/0 softirq=202/202 fqs=0
> [ 1983.072988] 18-...: (18 GPs behind) idle=2d4/0/0 softirq=218/220 fqs=0
> [ 1983.079676] 19-...: (19 GPs behind) idle=bbc/0/0 softirq=178/180 fqs=0
> [ 1983.086364] 20-...: (0 ticks this GP) idle=ee0/0/0 softirq=231/231 fqs=0
> [ 1983.093226] 21-...: (4 GPs behind) idle=140/0/0 softirq=208/208 fqs=0
> [ 1983.099827] 22-...: (5 GPs behind) idle=100/0/0 softirq=186/188 fqs=0
> [ 1983.106428] 23-...: (1635 GPs behind) idle=fd4/0/0 softirq=1220/1221 fqs=0
> [ 1983.113463] 24-...: (1112 GPs behind) idle=ca8/0/0 softirq=231/233 fqs=0
> [ 1983.120325] 25-...: (1637 GPs behind) idle=9c4/0/0 softirq=164/166 fqs=0
> [ 1983.127187] 27-...: (0 ticks this GP) idle=b08/0/0 softirq=182/182 fqs=0
> [ 1983.134048] 28-...: (1110 GPs behind) idle=d28/0/0 softirq=179/181 fqs=0
> [ 1983.140909] 29-...: (8 GPs behind) idle=1dc/0/0 softirq=196/198 fqs=0
> [ 1983.147511] 31-...: (1434 GPs behind) idle=74c/0/0 softirq=160/161 fqs=0
> [ 1983.154373] 32-...: (1432 GPs behind) idle=7d4/0/0 softirq=164/164 fqs=0
> [ 1983.161234] 33-...: (1632 GPs behind) idle=4dc/0/0 softirq=130/132 fqs=0
> [ 1983.168096] 34-...: (57 GPs behind) idle=3b0/0/0 softirq=411/411 fqs=0
> [ 1983.174784] 35-...: (1599 GPs behind) idle=8a0/0/0 softirq=177/179 fqs=0
> [ 1983.181646] 36-...: (1603 GPs behind) idle=520/0/0 softirq=132/134 fqs=0
> [ 1983.188507] 37-...: (8 GPs behind) idle=02c/0/0 softirq=176/178 fqs=0
> [ 1983.195108] 38-...: (1442 GPs behind) idle=d8c/0/0 softirq=3189/3190 fqs=0
> [ 1983.202144] 39-...: (1431 GPs behind) idle=444/0/0 softirq=117/117 fqs=0
> [ 1983.209005] 40-...: (4 GPs behind) idle=688/0/0 softirq=134/136 fqs=0
> [ 1983.215606] 41-...: (1599 GPs behind) idle=554/0/0 softirq=2707/2711 fqs=0
> [ 1983.222642] 42-...: (1430 GPs behind) idle=15c/0/0 softirq=110/111 fqs=0
> [ 1983.229503] 43-...: (4 GPs behind) idle=054/0/0 softirq=101/103 fqs=0
> [ 1983.236104] 46-...: (1117 GPs behind) idle=558/0/0 softirq=251/253 fqs=0
> [ 1983.242966] 47-...: (1118 GPs behind) idle=5f0/0/0 softirq=110/112 fqs=0
> [ 1983.249827] 48-...: (1621 GPs behind) idle=ef4/0/0 softirq=241/242 fqs=0
> [ 1983.256689] 49-...: (1648 GPs behind) idle=92c/0/0 softirq=207/208 fqs=0
> [ 1983.263550] 52-...: (1439 GPs behind) idle=e40/0/0 softirq=261/263 fqs=0
> [ 1983.270412] 54-...: (1434 GPs behind) idle=650/0/0 softirq=258/260 fqs=0
> [ 1983.277273] 55-...: (1646 GPs behind) idle=5e0/0/0 softirq=178/178 fqs=0
> [ 1983.284135] 56-...: (1646 GPs behind) idle=800/0/0 softirq=249/249 fqs=0
> [ 1983.290996] 57-...: (1599 GPs behind) idle=c48/0/0 softirq=222/224 fqs=0
> [ 1983.297858] 58-...: (1648 GPs behind) idle=3e8/0/0 softirq=235/235 fqs=0
> [ 1983.304719] 59-...: (1434 GPs behind) idle=720/0/0 softirq=201/203 fqs=0
> [ 1983.311581]60-...: (1647 GPs behind) idle=c80/0/0 softirq=250/250 
>