Re: Low-res tick handler device not going to ONESHOT_STOPPED when tick is stopped (was: rcu_sched self-detected stall on CPU)

2022-04-14 Thread Paul E. McKenney
On Wed, Apr 13, 2022 at 04:10:02PM +1000, Nicholas Piggin wrote: > Oops, fixed subject... > > Excerpts from Nicholas Piggin's message of April 13, 2022 3:11 pm: > > +Daniel, Thomas, Viresh > > > > Subject: Re: rcu_sched self-detected stall on CPU > > >

Low-res tick handler device not going to ONESHOT_STOPPED when tick is stopped (was: rcu_sched self-detected stall on CPU)

2022-04-12 Thread Nicholas Piggin
Oops, fixed subject... Excerpts from Nicholas Piggin's message of April 13, 2022 3:11 pm: > +Daniel, Thomas, Viresh > > Subject: Re: rcu_sched self-detected stall on CPU > > Excerpts from Michael Ellerman's message of April 9, 2022 12:42 am: >> Michael Ellerman

Re: rcu_sched self-detected stall on CPU

2022-04-12 Thread Paul E. McKenney
On Tue, Apr 12, 2022 at 04:53:06PM +1000, Michael Ellerman wrote: > "Paul E. McKenney" writes: > > On Sun, Apr 10, 2022 at 09:33:43PM +1000, Michael Ellerman wrote: > >> Zhouyi Zhou writes: > >> > On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney > >> > wrote: > >> >> On Fri, Apr 08, 2022 at 06:

Re: rcu_sched self-detected stall on CPU

2022-04-11 Thread Michael Ellerman
"Paul E. McKenney" writes: > On Sun, Apr 10, 2022 at 09:33:43PM +1000, Michael Ellerman wrote: >> Zhouyi Zhou writes: >> > On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney >> > wrote: >> >> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote: >> >> > On Fri, Apr 8, 2022 at 3:23 PM Micha

Re: rcu_sched self-detected stall on CPU

2022-04-10 Thread Paul E. McKenney
On Sun, Apr 10, 2022 at 09:33:43PM +1000, Michael Ellerman wrote: > Zhouyi Zhou writes: > > On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney wrote: > >> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote: > >> > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman > >> > wrote: > ... > >> >

Re: rcu_sched self-detected stall on CPU

2022-04-10 Thread Michael Ellerman
Zhouyi Zhou writes: > On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney wrote: >> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote: >> > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman >> > wrote: ... >> > > I haven't seen it in my testing. But using Miguel's config I can >> > > repro

Re: rcu_sched self-detected stall on CPU

2022-04-08 Thread Miguel Ojeda
On Fri, Apr 8, 2022 at 4:42 PM Michael Ellerman wrote: > > The Rust CI has it disabled because I copied that from the x86 defconfig > they were using back when I added the Rust support. I think that was > meant to be a stripped down fast config for CI, but the result is it's Indeed, that was my i

Re: rcu_sched self-detected stall on CPU

2022-04-08 Thread Paul E. McKenney
On Sat, Apr 09, 2022 at 12:42:39AM +1000, Michael Ellerman wrote: > Michael Ellerman writes: > > "Paul E. McKenney" writes: > >> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote: > >>> Hi > >>> > >>> I can reproduce it in a ppc virtual cloud server provided by Oregon > >>> State Unive

Re: rcu_sched self-detected stall on CPU

2022-04-08 Thread Michael Ellerman
Michael Ellerman writes: > "Paul E. McKenney" writes: >> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote: >>> Hi >>> >>> I can reproduce it in a ppc virtual cloud server provided by Oregon >>> State University. Following is what I do: >>> 1) curl -l >>> https://git.kernel.org/pub/s

Re: rcu_sched self-detected stall on CPU

2022-04-08 Thread Zhouyi Zhou
On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney wrote: > > On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote: > > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman wrote: > > > > > > "Paul E. McKenney" writes: > > > > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote: > > > >>

Re: rcu_sched self-detected stall on CPU

2022-04-08 Thread Paul E. McKenney
On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote: > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman wrote: > > > > "Paul E. McKenney" writes: > > > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote: > > >> Hi > > >> > > >> I can reproduce it in a ppc virtual cloud server prov

Re: rcu_sched self-detected stall on CPU

2022-04-08 Thread Paul E. McKenney
On Fri, Apr 08, 2022 at 05:23:32PM +1000, Michael Ellerman wrote: > "Paul E. McKenney" writes: > > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote: > >> Hi > >> > >> I can reproduce it in a ppc virtual cloud server provided by Oregon > >> State University. Following is what I do: > >

Re: rcu_sched self-detected stall on CPU

2022-04-08 Thread Miguel Ojeda
On Fri, Apr 8, 2022 at 9:23 AM Michael Ellerman wrote: > > I haven't seen it in my testing. But using Miguel's config I can > reproduce it seemingly on every boot. Hmm... I noticed this for some kernel builds: in some builds/commits, it triggered the very first time, while in others I had to re-t

Re: rcu_sched self-detected stall on CPU

2022-04-08 Thread Zhouyi Zhou
On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman wrote: > > "Paul E. McKenney" writes: > > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote: > >> Hi > >> > >> I can reproduce it in a ppc virtual cloud server provided by Oregon > >> State University. Following is what I do: > >> 1) curl

Re: rcu_sched self-detected stall on CPU

2022-04-08 Thread Michael Ellerman
"Paul E. McKenney" writes: > On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote: >> Hi >> >> I can reproduce it in a ppc virtual cloud server provided by Oregon >> State University. Following is what I do: >> 1) curl -l >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.

Re: rcu_sched self-detected stall on CPU

2022-04-07 Thread Paul E. McKenney
On Fri, Apr 08, 2022 at 07:14:20AM +0800, Zhouyi Zhou wrote: > Dear Paul and Miguel > > On Fri, Apr 8, 2022 at 1:55 AM Paul E. McKenney wrote: > > > > On Thu, Apr 07, 2022 at 07:05:58PM +0200, Miguel Ojeda wrote: > > > On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney > > > wrote: > > > > > > > >

Re: rcu_sched self-detected stall on CPU

2022-04-07 Thread Zhouyi Zhou
Dear Paul and Miguel On Fri, Apr 8, 2022 at 1:55 AM Paul E. McKenney wrote: > > On Thu, Apr 07, 2022 at 07:05:58PM +0200, Miguel Ojeda wrote: > > On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney wrote: > > > > > > Ah. So you would instead look for boot to have completed within 10 > > > seconds?

Re: rcu_sched self-detected stall on CPU

2022-04-07 Thread Paul E. McKenney
On Thu, Apr 07, 2022 at 07:05:58PM +0200, Miguel Ojeda wrote: > On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney wrote: > > > > Ah. So you would instead look for boot to have completed within 10 > > seconds? Either way, reliable automation might well more important than > > reduction in time. >

Re: rcu_sched self-detected stall on CPU

2022-04-07 Thread Miguel Ojeda
On Thu, Apr 7, 2022 at 5:15 PM Paul E. McKenney wrote: > > Ah. So you would instead look for boot to have completed within 10 > seconds? Either way, reliable automation might well more important than > reduction in time. No (although I guess that could be an option), I was only pointing out tha

Re: rcu_sched self-detected stall on CPU

2022-04-07 Thread Paul E. McKenney
On Thu, Apr 07, 2022 at 12:07:34PM +0200, Miguel Ojeda wrote: > On Thu, Apr 7, 2022 at 4:27 AM Zhouyi Zhou wrote: > > > > Yes, this happens within 30 seconds after kernel boot. If we take all > > into account (qemu preparing, kernel loading), we can do one test > > within 54 seconds. > > When it

Re: rcu_sched self-detected stall on CPU

2022-04-07 Thread Miguel Ojeda
On Thu, Apr 7, 2022 at 4:27 AM Zhouyi Zhou wrote: > > Yes, this happens within 30 seconds after kernel boot. If we take all > into account (qemu preparing, kernel loading), we can do one test > within 54 seconds. When it does not trigger, the run should be 20 seconds quicker than that (e.g. 10 s

Re: rcu_sched self-detected stall on CPU

2022-04-06 Thread Zhouyi Zhou
l > > > Thanks > > Zhouyi > > > > > > Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period > > > kthread slept for three milliseconds, but did not wake up for more than > > > 20 seconds. This kthread would normally have awakened on CP

Re: rcu_sched self-detected stall on CPU

2022-04-06 Thread Paul E. McKenney
on CPU 1, but > > CPU 1 looks to me to be very unhealthy, as can be seen in your console > > output below (but maybe my idea of what is healthy for powerpc systems > > is outdated). Please see also the inline annotations. > > > > Thoughts from the PPC guys? > > > >

Re: rcu_sched self-detected stall on CPU

2022-04-06 Thread Zhouyi Zhou
see also the inline annotations. > > Thoughts from the PPC guys? > > Thanx, Paul > > > > [ 21.186912] rcu: INFO: rcu_sched self-detected stall on CPU > [ 21.187331] rcu: 1-...!: (4712629 ticks this

Re: rcu_sched self-detected stall on CPU

2022-04-06 Thread Paul E. McKenney
----
[ 21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
[ 21.187331] rcu: 1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0
[ 21.187529] (t=21000 jiffies g=-1183 q=3)
[ 21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f

Re: rcu_sched self-detected stall on CPU

2022-04-06 Thread Zhouyi Zhou
Hi
I can reproduce it in a ppc virtual cloud server provided by Oregon State University. Following is what I do:
1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz -o linux-5.18-rc1.tar.gz
2) tar zxf linux-5.18-rc1.tar.gz
3) cp config lin

rcu_sched self-detected stall on CPU

2022-04-05 Thread Miguel Ojeda
Hi PPC/RCU, While merging v5.18-rc1 changes I noticed our CI PPC runs broke. I reproduced the problem in v5.18-rc1 as well as next-20220405, under both QEMU 4.2.1 and 6.1.0, with `-smp 2`; but I cannot reproduce it in v5.17 from a few tries. Sadly, the problem is not deterministic although it is