Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-31 Thread Michel Lespinasse
On Sat, Jul 30, 2022 at 09:52:34PM +0200, Rafael J. Wysocki wrote:
> On Sat, Jul 30, 2022 at 11:48 AM Michel Lespinasse
>  wrote:
> > I'm not sure if that was the patch you meant to send though, as it
> > seems it's only adding a tracepoint so shouldn't make any difference
> > if I'm not actually using the tracepoint ?
> 
> You are right, it looks like I pasted a link to a different patch by
> mistake.  Sorry about that.
> 
> I meant this one:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=pm=d295ad34f236c3518634fb6403d4c0160456e470
> 
> which will appear in the final 5.19.

Thanks. I can confirm that this patch fixes the boot time debug
warnings for me. And I see that linus already merged it, nice!

--
Michel "walken" Lespinasse.


Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-30 Thread Rafael J. Wysocki
On Sat, Jul 30, 2022 at 11:48 AM Michel Lespinasse
 wrote:
>
> On Fri, Jul 29, 2022 at 04:59:50PM +0200, Rafael J. Wysocki wrote:
> > On Fri, Jul 29, 2022 at 12:25 PM Michel Lespinasse
> >  wrote:
> > >
> > > On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > > > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > > > Xeons") wrecked intel_idle in two ways:
> > > > > >
> > > > > >  - must not have tracing in idle functions
> > > > > >  - must return with IRQs disabled
> > > > > >
> > > > > > Additionally, it added a branch for no good reason.
> > > > > >
> > > > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on 
> > > > > > Xeons")
> > > > > > Signed-off-by: Peter Zijlstra (Intel) 
> > > > >
> > > > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > > > usage" when booting a kernel with debug options compiled in. Please
> > > > > see the attached dmesg output. The issue starts with commit 
> > > > > 32d4fd5751ea
> > > > > and is still present in v5.19-rc8.
> > > > >
> > > > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> > > >
> > > > I finally got a chance to take a quick look at this.
> > > >
> > > > The rcu_eqs_exit() function is making a lockdep complaint about
> > > > being invoked with interrupts enabled.  This function is called from
> > > > rcu_idle_exit(), which is an expected code path from 
> > > > cpuidle_enter_state()
> > > > via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
> > > > interrupts before invoking rcu_eqs_exit().
> > > >
> > > > The only other call to rcu_idle_exit() does not disable interrupts,
> > > > but it is via rcu_user_exit(), which would be a very odd choice for
> > > > cpuidle_enter_state().
> > > >
> > > > It seems unlikely, but it might be that it is the use of 
> > > > local_irq_save()
> > > > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > > > the trouble.  If this is the case, then the commit shown below would
> > > > help.  Note that this commit removes the warning from lockdep, so it
> > > > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > > > equivalent debugging.
> > > >
> > > > Could you please try your test with the -rce commit shown below applied?
> > >
> > > Thanks for looking into it.
> > >
> > > After checking out Peter's commit 32d4fd5751ea,
> > > cherry picking your commit ed4ae5eff4b3,
> > > and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> > > I am now seeing this a few seconds into the boot:
> > >
> > > [3.010650] [ cut here ]
> > > [3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 
> > > sched_clock_tick+0x27/0x60
> > > [3.010657] Modules linked in:
> > > [3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> > > 5.19.0-rc1-test-5-g1be22fea0611 #1
> > > [3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 
> > > 01/17/2022
> > > [3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> > > [3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 
> > > c0 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 
> > > 02 <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
> > >  89 c0 48 03 1c c5 c0 98
> > > [3.010667] RSP: :b2803e28 EFLAGS: 00010002
> > > [3.010670] RAX: 0001 RBX: c8ce7fa07060 RCX: 
> > > 0001
> > > [3.010671] RDX:  RSI: b268dd21 RDI: 
> > > b269ab13
> > > [3.010673] RBP: 0001 R08: ffc300d5 R09: 
> > > 0002be80
> > > [3.010674] R10: 03625b53183a R11: a012b802b7a4 R12: 
> > > b2aa9e80
> > > [3.010675] R13: b2aa9e00 R14: 0001 R15: 
> > > 
> > > [3.010677] FS:  () GS:a012b800() 
> > > knlGS:
> > > [3.010678] CS:  0010 DS:  ES:  CR0: 80050033
> > > [3.010680] CR2: a012f81ff000 CR3: 000c99612001 CR4: 
> > > 003706f0
> > > [3.010681] DR0:  DR1:  DR2: 
> > > 
> > > [3.010682] DR3:  DR6: fffe0ff0 DR7: 
> > > 0400
> > > [3.010683] Call Trace:
> > > [3.010685]  
> > > [3.010688]  cpuidle_enter_state+0xb7/0x4b0
> > > [3.010694]  cpuidle_enter+0x29/0x40
> > > [3.010697]  do_idle+0x1d4/0x210
> > > [3.010702]  cpu_startup_entry+0x19/0x20
> > > [3.010704]  rest_init+0x117/0x1a0
> > > [3.010708]  arch_call_rest_init+0xa/0x10
> > > [3.010711]  start_kernel+0x6d8/0x6ff
> > > [3.010716]  secondary_startup_64_no_verify+0xce/0xdb
> > > [3.010728]  
> > > [3.010729] irq event stamp: 44179
> > > [3.010730] 

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-30 Thread Paul E. McKenney
On Sat, Jul 30, 2022 at 02:40:32AM -0700, Michel Lespinasse wrote:
> On Fri, Jul 29, 2022 at 08:26:22AM -0700, Paul E. McKenney wrote:> Would you 
> be willing to try another shot in the dark, but untested
> > this time?  I freely admit that this is getting strange.
> > 
> > Thanx, Paul
> 
> Yes, adding this second change got rid of the boot time warning for me.

OK, I will make a real patch.  May I have your Tested-by?

Thanx, Paul

> > 
> > 
> > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> > index e374c0c923dae..279f557bf60bb 100644
> > --- a/kernel/sched/clock.c
> > +++ b/kernel/sched/clock.c
> > @@ -394,7 +394,7 @@ notrace void sched_clock_tick(void)
> > if (!static_branch_likely(_clock_running))
> > return;
> >  
> > -   lockdep_assert_irqs_disabled();
> > +   WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
> >  
> > scd = this_scd();
> > __scd_stamp(scd);


Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-30 Thread Michel Lespinasse
On Fri, Jul 29, 2022 at 04:59:50PM +0200, Rafael J. Wysocki wrote:
> On Fri, Jul 29, 2022 at 12:25 PM Michel Lespinasse
>  wrote:
> >
> > On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > > Xeons") wrecked intel_idle in two ways:
> > > > >
> > > > >  - must not have tracing in idle functions
> > > > >  - must return with IRQs disabled
> > > > >
> > > > > Additionally, it added a branch for no good reason.
> > > > >
> > > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on 
> > > > > Xeons")
> > > > > Signed-off-by: Peter Zijlstra (Intel) 
> > > >
> > > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > > usage" when booting a kernel with debug options compiled in. Please
> > > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > > and is still present in v5.19-rc8.
> > > >
> > > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> > >
> > > I finally got a chance to take a quick look at this.
> > >
> > > The rcu_eqs_exit() function is making a lockdep complaint about
> > > being invoked with interrupts enabled.  This function is called from
> > > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > > via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
> > > interrupts before invoking rcu_eqs_exit().
> > >
> > > The only other call to rcu_idle_exit() does not disable interrupts,
> > > but it is via rcu_user_exit(), which would be a very odd choice for
> > > cpuidle_enter_state().
> > >
> > > It seems unlikely, but it might be that it is the use of local_irq_save()
> > > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > > the trouble.  If this is the case, then the commit shown below would
> > > help.  Note that this commit removes the warning from lockdep, so it
> > > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > > equivalent debugging.
> > >
> > > Could you please try your test with the -rce commit shown below applied?
> >
> > Thanks for looking into it.
> >
> > After checking out Peter's commit 32d4fd5751ea,
> > cherry picking your commit ed4ae5eff4b3,
> > and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> > I am now seeing this a few seconds into the boot:
> >
> > [3.010650] [ cut here ]
> > [3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 
> > sched_clock_tick+0x27/0x60
> > [3.010657] Modules linked in:
> > [3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> > 5.19.0-rc1-test-5-g1be22fea0611 #1
> > [3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 
> > 01/17/2022
> > [3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> > [3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 
> > 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 
> > <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
> >  89 c0 48 03 1c c5 c0 98
> > [3.010667] RSP: :b2803e28 EFLAGS: 00010002
> > [3.010670] RAX: 0001 RBX: c8ce7fa07060 RCX: 
> > 0001
> > [3.010671] RDX:  RSI: b268dd21 RDI: 
> > b269ab13
> > [3.010673] RBP: 0001 R08: ffc300d5 R09: 
> > 0002be80
> > [3.010674] R10: 03625b53183a R11: a012b802b7a4 R12: 
> > b2aa9e80
> > [3.010675] R13: b2aa9e00 R14: 0001 R15: 
> > 
> > [3.010677] FS:  () GS:a012b800() 
> > knlGS:
> > [3.010678] CS:  0010 DS:  ES:  CR0: 80050033
> > [3.010680] CR2: a012f81ff000 CR3: 000c99612001 CR4: 
> > 003706f0
> > [3.010681] DR0:  DR1:  DR2: 
> > 
> > [3.010682] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> > [3.010683] Call Trace:
> > [3.010685]  
> > [3.010688]  cpuidle_enter_state+0xb7/0x4b0
> > [3.010694]  cpuidle_enter+0x29/0x40
> > [3.010697]  do_idle+0x1d4/0x210
> > [3.010702]  cpu_startup_entry+0x19/0x20
> > [3.010704]  rest_init+0x117/0x1a0
> > [3.010708]  arch_call_rest_init+0xa/0x10
> > [3.010711]  start_kernel+0x6d8/0x6ff
> > [3.010716]  secondary_startup_64_no_verify+0xce/0xdb
> > [3.010728]  
> > [3.010729] irq event stamp: 44179
> > [3.010730] hardirqs last  enabled at (44179): [] 
> > asm_sysvec_apic_timer_interrupt+0x1b/0x20
> > [3.010734] hardirqs last disabled at (44177): [] 
> > __do_softirq+0x3f0/0x498
> > [3.010736] softirqs last  enabled at (44178): [] 
> > __do_softirq+0x332/0x498
> > [3.010738] softirqs last 

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-30 Thread Michel Lespinasse
On Fri, Jul 29, 2022 at 08:26:22AM -0700, Paul E. McKenney wrote:> Would you be 
willing to try another shot in the dark, but untested
> this time?  I freely admit that this is getting strange.
> 
>   Thanx, Paul

Yes, adding this second change got rid of the boot time warning for me.

> 
> 
> diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> index e374c0c923dae..279f557bf60bb 100644
> --- a/kernel/sched/clock.c
> +++ b/kernel/sched/clock.c
> @@ -394,7 +394,7 @@ notrace void sched_clock_tick(void)
>   if (!static_branch_likely(_clock_running))
>   return;
>  
> - lockdep_assert_irqs_disabled();
> + WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
>  
>   scd = this_scd();
>   __scd_stamp(scd);


Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-29 Thread Paul E. McKenney
Or better yet, try the patch that Rafael proposed.  ;-)

Thanx, Paul

On Fri, Jul 29, 2022 at 08:26:22AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 29, 2022 at 03:24:58AM -0700, Michel Lespinasse wrote:
> > On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > > Xeons") wrecked intel_idle in two ways:
> > > > > 
> > > > >  - must not have tracing in idle functions
> > > > >  - must return with IRQs disabled
> > > > > 
> > > > > Additionally, it added a branch for no good reason.
> > > > > 
> > > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on 
> > > > > Xeons")
> > > > > Signed-off-by: Peter Zijlstra (Intel) 
> > > > 
> > > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > > usage" when booting a kernel with debug options compiled in. Please
> > > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > > and is still present in v5.19-rc8.
> > > > 
> > > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> > > 
> > > I finally got a chance to take a quick look at this.
> > > 
> > > The rcu_eqs_exit() function is making a lockdep complaint about
> > > being invoked with interrupts enabled.  This function is called from
> > > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > > via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
> > > interrupts before invoking rcu_eqs_exit().
> > > 
> > > The only other call to rcu_idle_exit() does not disable interrupts,
> > > but it is via rcu_user_exit(), which would be a very odd choice for
> > > cpuidle_enter_state().
> > > 
> > > It seems unlikely, but it might be that it is the use of local_irq_save()
> > > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > > the trouble.  If this is the case, then the commit shown below would
> > > help.  Note that this commit removes the warning from lockdep, so it
> > > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > > equivalent debugging.
> > > 
> > > Could you please try your test with the -rce commit shown below applied?
> > 
> > Thanks for looking into it.
> 
> And thank you for trying this shot in the dark!
> 
> > After checking out Peter's commit 32d4fd5751ea,
> > cherry picking your commit ed4ae5eff4b3,
> > and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> > I am now seeing this a few seconds into the boot:
> > 
> > [3.010650] [ cut here ]
> > [3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 
> > sched_clock_tick+0x27/0x60
> 
> And this is again a complaint about interrupts not being disabled.
> 
> But it does appear that the problem was the lockdep complaint, and
> eliminating that did take care of part of the problem.  But lockdep
> remained enabled, and you therefore hit the next complaint.
> 
> > [3.010657] Modules linked in:
> > [3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> > 5.19.0-rc1-test-5-g1be22fea0611 #1
> > [3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 
> > 01/17/2022
> > [3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> 
> The most straightforward way to get to sched_clock_tick() from
> cpuidle_enter_state() is via the call to sched_clock_idle_wakeup_event().
> 
> Except that it disables interrupts before invoking sched_clock_tick().
> 
> > [3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 
> > 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 
> > <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
> >  89 c0 48 03 1c c5 c0 98
> > [3.010667] RSP: :b2803e28 EFLAGS: 00010002
> > [3.010670] RAX: 0001 RBX: c8ce7fa07060 RCX: 
> > 0001
> > [3.010671] RDX:  RSI: b268dd21 RDI: 
> > b269ab13
> > [3.010673] RBP: 0001 R08: ffc300d5 R09: 
> > 0002be80
> > [3.010674] R10: 03625b53183a R11: a012b802b7a4 R12: 
> > b2aa9e80
> > [3.010675] R13: b2aa9e00 R14: 0001 R15: 
> > 
> > [3.010677] FS:  () GS:a012b800() 
> > knlGS:
> > [3.010678] CS:  0010 DS:  ES:  CR0: 80050033
> > [3.010680] CR2: a012f81ff000 CR3: 000c99612001 CR4: 
> > 003706f0
> > [3.010681] DR0:  DR1:  DR2: 
> > 
> > [3.010682] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> > [3.010683] Call Trace:
> > [3.010685]  
> > [3.010688]  cpuidle_enter_state+0xb7/0x4b0
> > [

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-29 Thread Paul E. McKenney
On Fri, Jul 29, 2022 at 03:24:58AM -0700, Michel Lespinasse wrote:
> On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > Xeons") wrecked intel_idle in two ways:
> > > > 
> > > >  - must not have tracing in idle functions
> > > >  - must return with IRQs disabled
> > > > 
> > > > Additionally, it added a branch for no good reason.
> > > > 
> > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > Signed-off-by: Peter Zijlstra (Intel) 
> > > 
> > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > usage" when booting a kernel with debug options compiled in. Please
> > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > and is still present in v5.19-rc8.
> > > 
> > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> > 
> > I finally got a chance to take a quick look at this.
> > 
> > The rcu_eqs_exit() function is making a lockdep complaint about
> > being invoked with interrupts enabled.  This function is called from
> > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
> > interrupts before invoking rcu_eqs_exit().
> > 
> > The only other call to rcu_idle_exit() does not disable interrupts,
> > but it is via rcu_user_exit(), which would be a very odd choice for
> > cpuidle_enter_state().
> > 
> > It seems unlikely, but it might be that it is the use of local_irq_save()
> > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > the trouble.  If this is the case, then the commit shown below would
> > help.  Note that this commit removes the warning from lockdep, so it
> > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > equivalent debugging.
> > 
> > Could you please try your test with the -rce commit shown below applied?
> 
> Thanks for looking into it.

And thank you for trying this shot in the dark!

> After checking out Peter's commit 32d4fd5751ea,
> cherry picking your commit ed4ae5eff4b3,
> and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> I am now seeing this a few seconds into the boot:
> 
> [3.010650] [ cut here ]
> [3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 
> sched_clock_tick+0x27/0x60

And this is again a complaint about interrupts not being disabled.

But it does appear that the problem was the lockdep complaint, and
eliminating that did take care of part of the problem.  But lockdep
remained enabled, and you therefore hit the next complaint.

> [3.010657] Modules linked in:
> [3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> 5.19.0-rc1-test-5-g1be22fea0611 #1
> [3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> [3.010663] RIP: 0010:sched_clock_tick+0x27/0x60

The most straightforward way to get to sched_clock_tick() from
cpuidle_enter_state() is via the call to sched_clock_idle_wakeup_event().

Except that it disables interrupts before invoking sched_clock_tick().

> [3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 
> 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b 
> e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
>  89 c0 48 03 1c c5 c0 98
> [3.010667] RSP: :b2803e28 EFLAGS: 00010002
> [3.010670] RAX: 0001 RBX: c8ce7fa07060 RCX: 
> 0001
> [3.010671] RDX:  RSI: b268dd21 RDI: 
> b269ab13
> [3.010673] RBP: 0001 R08: ffc300d5 R09: 
> 0002be80
> [3.010674] R10: 03625b53183a R11: a012b802b7a4 R12: 
> b2aa9e80
> [3.010675] R13: b2aa9e00 R14: 0001 R15: 
> 
> [3.010677] FS:  () GS:a012b800() 
> knlGS:
> [3.010678] CS:  0010 DS:  ES:  CR0: 80050033
> [3.010680] CR2: a012f81ff000 CR3: 000c99612001 CR4: 
> 003706f0
> [3.010681] DR0:  DR1:  DR2: 
> 
> [3.010682] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [3.010683] Call Trace:
> [3.010685]  
> [3.010688]  cpuidle_enter_state+0xb7/0x4b0
> [3.010694]  cpuidle_enter+0x29/0x40
> [3.010697]  do_idle+0x1d4/0x210
> [3.010702]  cpu_startup_entry+0x19/0x20
> [3.010704]  rest_init+0x117/0x1a0
> [3.010708]  arch_call_rest_init+0xa/0x10
> [3.010711]  start_kernel+0x6d8/0x6ff
> [3.010716]  secondary_startup_64_no_verify+0xce/0xdb
> [3.010728]  
> [3.010729] irq event stamp: 44179
> [3.010730] hardirqs last  enabled at 

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-29 Thread Rafael J. Wysocki
On Fri, Jul 29, 2022 at 12:25 PM Michel Lespinasse
 wrote:
>
> On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > Xeons") wrecked intel_idle in two ways:
> > > >
> > > >  - must not have tracing in idle functions
> > > >  - must return with IRQs disabled
> > > >
> > > > Additionally, it added a branch for no good reason.
> > > >
> > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > Signed-off-by: Peter Zijlstra (Intel) 
> > >
> > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > usage" when booting a kernel with debug options compiled in. Please
> > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > and is still present in v5.19-rc8.
> > >
> > > I'm not sure, is this too late to fix or revert in v5.19 final ?
> >
> > I finally got a chance to take a quick look at this.
> >
> > The rcu_eqs_exit() function is making a lockdep complaint about
> > being invoked with interrupts enabled.  This function is called from
> > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
> > interrupts before invoking rcu_eqs_exit().
> >
> > The only other call to rcu_idle_exit() does not disable interrupts,
> > but it is via rcu_user_exit(), which would be a very odd choice for
> > cpuidle_enter_state().
> >
> > It seems unlikely, but it might be that it is the use of local_irq_save()
> > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > the trouble.  If this is the case, then the commit shown below would
> > help.  Note that this commit removes the warning from lockdep, so it
> > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > equivalent debugging.
> >
> > Could you please try your test with the -rce commit shown below applied?
>
> Thanks for looking into it.
>
> After checking out Peter's commit 32d4fd5751ea,
> cherry picking your commit ed4ae5eff4b3,
> and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
> I am now seeing this a few seconds into the boot:
>
> [3.010650] [ cut here ]
> [3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 
> sched_clock_tick+0x27/0x60
> [3.010657] Modules linked in:
> [3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> 5.19.0-rc1-test-5-g1be22fea0611 #1
> [3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> [3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> [3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 
> 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b 
> e8 e2 6c 89 00 48 c7 c3 40 d5 02 00
>  89 c0 48 03 1c c5 c0 98
> [3.010667] RSP: :b2803e28 EFLAGS: 00010002
> [3.010670] RAX: 0001 RBX: c8ce7fa07060 RCX: 
> 0001
> [3.010671] RDX:  RSI: b268dd21 RDI: 
> b269ab13
> [3.010673] RBP: 0001 R08: ffc300d5 R09: 
> 0002be80
> [3.010674] R10: 03625b53183a R11: a012b802b7a4 R12: 
> b2aa9e80
> [3.010675] R13: b2aa9e00 R14: 0001 R15: 
> 
> [3.010677] FS:  () GS:a012b800() 
> knlGS:
> [3.010678] CS:  0010 DS:  ES:  CR0: 80050033
> [3.010680] CR2: a012f81ff000 CR3: 000c99612001 CR4: 
> 003706f0
> [3.010681] DR0:  DR1:  DR2: 
> 
> [3.010682] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [3.010683] Call Trace:
> [3.010685]  
> [3.010688]  cpuidle_enter_state+0xb7/0x4b0
> [3.010694]  cpuidle_enter+0x29/0x40
> [3.010697]  do_idle+0x1d4/0x210
> [3.010702]  cpu_startup_entry+0x19/0x20
> [3.010704]  rest_init+0x117/0x1a0
> [3.010708]  arch_call_rest_init+0xa/0x10
> [3.010711]  start_kernel+0x6d8/0x6ff
> [3.010716]  secondary_startup_64_no_verify+0xce/0xdb
> [3.010728]  
> [3.010729] irq event stamp: 44179
> [3.010730] hardirqs last  enabled at (44179): [] 
> asm_sysvec_apic_timer_interrupt+0x1b/0x20
> [3.010734] hardirqs last disabled at (44177): [] 
> __do_softirq+0x3f0/0x498
> [3.010736] softirqs last  enabled at (44178): [] 
> __do_softirq+0x332/0x498
> [3.010738] softirqs last disabled at (44171): [] 
> irq_exit_rcu+0xab/0xf0
> [3.010741] ---[ end trace  ]---

Can you please give this patch a go:
https://patchwork.kernel.org/project/linux-pm/patch/Yt/axpfi88new...@e126311.manchester.arm.com/
?


Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-29 Thread Michel Lespinasse
On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > Xeons") wrecked intel_idle in two ways:
> > > 
> > >  - must not have tracing in idle functions
> > >  - must return with IRQs disabled
> > > 
> > > Additionally, it added a branch for no good reason.
> > > 
> > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > Signed-off-by: Peter Zijlstra (Intel) 
> > 
> > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > usage" when booting a kernel with debug options compiled in. Please
> > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > and is still present in v5.19-rc8.
> > 
> > I'm not sure, is this too late to fix or revert in v5.19 final ?
> 
> I finally got a chance to take a quick look at this.
> 
> The rcu_eqs_exit() function is making a lockdep complaint about
> being invoked with interrupts enabled.  This function is called from
> rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
> interrupts before invoking rcu_eqs_exit().
> 
> The only other call to rcu_idle_exit() does not disable interrupts,
> but it is via rcu_user_exit(), which would be a very odd choice for
> cpuidle_enter_state().
> 
> It seems unlikely, but it might be that it is the use of local_irq_save()
> instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> the trouble.  If this is the case, then the commit shown below would
> help.  Note that this commit removes the warning from lockdep, so it
> is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> equivalent debugging.
> 
> Could you please try your test with the -rce commit shown below applied?

Thanks for looking into it.

After checking out Peter's commit 32d4fd5751ea,
cherry picking your commit ed4ae5eff4b3,
and setting CONFIG_RCU_EQS_DEBUG=y in addition of my usual debug config,
I am now seeing this a few seconds into the boot:

[3.010650] [ cut here ]
[3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 
sched_clock_tick+0x27/0x60
[3.010657] Modules linked in:
[3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.19.0-rc1-test-5-g1be22fea0611 #1
[3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
[3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
[3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 
18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b e8 
e2 6c 89 00 48 c7 c3 40 d5 02 00
 89 c0 48 03 1c c5 c0 98
[3.010667] RSP: :b2803e28 EFLAGS: 00010002
[3.010670] RAX: 0001 RBX: c8ce7fa07060 RCX: 0001
[3.010671] RDX:  RSI: b268dd21 RDI: b269ab13
[3.010673] RBP: 0001 R08: ffc300d5 R09: 0002be80
[3.010674] R10: 03625b53183a R11: a012b802b7a4 R12: b2aa9e80
[3.010675] R13: b2aa9e00 R14: 0001 R15: 
[3.010677] FS:  () GS:a012b800() 
knlGS:
[3.010678] CS:  0010 DS:  ES:  CR0: 80050033
[3.010680] CR2: a012f81ff000 CR3: 000c99612001 CR4: 003706f0
[3.010681] DR0:  DR1:  DR2: 
[3.010682] DR3:  DR6: fffe0ff0 DR7: 0400
[3.010683] Call Trace:
[3.010685]  
[3.010688]  cpuidle_enter_state+0xb7/0x4b0
[3.010694]  cpuidle_enter+0x29/0x40
[3.010697]  do_idle+0x1d4/0x210
[3.010702]  cpu_startup_entry+0x19/0x20
[3.010704]  rest_init+0x117/0x1a0
[3.010708]  arch_call_rest_init+0xa/0x10
[3.010711]  start_kernel+0x6d8/0x6ff
[3.010716]  secondary_startup_64_no_verify+0xce/0xdb
[3.010728]  
[3.010729] irq event stamp: 44179
[3.010730] hardirqs last  enabled at (44179): [] 
asm_sysvec_apic_timer_interrupt+0x1b/0x20
[3.010734] hardirqs last disabled at (44177): [] 
__do_softirq+0x3f0/0x498
[3.010736] softirqs last  enabled at (44178): [] 
__do_softirq+0x332/0x498
[3.010738] softirqs last disabled at (44171): [] 
irq_exit_rcu+0xab/0xf0
[3.010741] ---[ end trace  ]---


Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-28 Thread Paul E. McKenney
On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > Xeons") wrecked intel_idle in two ways:
> > 
> >  - must not have tracing in idle functions
> >  - must return with IRQs disabled
> > 
> > Additionally, it added a branch for no good reason.
> > 
> > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > Signed-off-by: Peter Zijlstra (Intel) 
> 
> After this change was introduced, I am seeing "WARNING: suspicious RCU
> usage" when booting a kernel with debug options compiled in. Please
> see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> and is still present in v5.19-rc8.
> 
> I'm not sure, is this too late to fix or revert in v5.19 final ?

I finally got a chance to take a quick look at this.

The rcu_eqs_exit() function is making a lockdep complaint about
being invoked with interrupts enabled.  This function is called from
rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
via its call to rcu_idle_exit().  Except that rcu_idle_exit() disables
interrupts before invoking rcu_eqs_exit().

The only other call to rcu_idle_exit() does not disable interrupts,
but it is via rcu_user_exit(), which would be a very odd choice for
cpuidle_enter_state().

It seems unlikely, but it might be that it is the use of local_irq_save()
instead of raw_local_irq_save() within rcu_idle_exit() that is causing
the trouble.  If this is the case, then the commit shown below would
help.  Note that this commit removes the warning from lockdep, so it
is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
equivalent debugging.

Could you please try your test with the -rce commit shown below applied?

Thanx, Paul



commit ed4ae5eff4b38797607cbdd80da394149110fb37
Author: Paul E. McKenney 
Date:   Tue May 17 21:00:04 2022 -0700

rcu: Apply noinstr to rcu_idle_enter() and rcu_idle_exit()

This commit applies the "noinstr" tag to the rcu_idle_enter() and
rcu_idle_exit() functions, which are invoked from portions of the idle
loop that cannot be instrumented.  These tags require reworking the
rcu_eqs_enter() and rcu_eqs_exit() functions that these two functions
invoke in order to cause them to use normal assertions rather than
lockdep.  In addition, within rcu_idle_exit(), the raw versions of
local_irq_save() and local_irq_restore() are used, again to avoid issues
with lockdep in uninstrumented code.

This patch is based in part on an earlier patch by Jiri Olsa, discussions
with Peter Zijlstra and Frederic Weisbecker, earlier changes by Thomas
Gleixner, and off-list discussions with Yonghong Song.

Link: 
https://lore.kernel.org/lkml/20220515203653.4039075-1-jo...@kernel.org/
Reported-by: Jiri Olsa 
Reported-by: Alexei Starovoitov 
Reported-by: Andrii Nakryiko 
Signed-off-by: Paul E. McKenney 
Reviewed-by: Yonghong Song 

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c25ba442044a6..9a5edab5558c9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -631,8 +631,8 @@ static noinstr void rcu_eqs_enter(bool user)
return;
}
 
-   lockdep_assert_irqs_disabled();
instrumentation_begin();
+   lockdep_assert_irqs_disabled();
trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, 
atomic_read(>dynticks));
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && 
!is_idle_task(current));
rcu_preempt_deferred_qs(current);
@@ -659,9 +659,9 @@ static noinstr void rcu_eqs_enter(bool user)
  * If you add or remove a call to rcu_idle_enter(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_idle_enter(void)
+void noinstr rcu_idle_enter(void)
 {
-   lockdep_assert_irqs_disabled();
+   WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
rcu_eqs_enter(false);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
@@ -861,7 +861,7 @@ static void noinstr rcu_eqs_exit(bool user)
struct rcu_data *rdp;
long oldval;
 
-   lockdep_assert_irqs_disabled();
+   WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
rdp = this_cpu_ptr(_data);
oldval = rdp->dynticks_nesting;
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
@@ -896,13 +896,13 @@ static void noinstr rcu_eqs_exit(bool user)
  * If you add or remove a call to rcu_idle_exit(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_idle_exit(void)
+void noinstr rcu_idle_exit(void)
 {
unsigned long flags;
 
-   local_irq_save(flags);
+   raw_local_irq_save(flags);
rcu_eqs_exit(false);
-   local_irq_restore(flags);
+   

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-07-25 Thread Michel Lespinasse
On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> Xeons") wrecked intel_idle in two ways:
> 
>  - must not have tracing in idle functions
>  - must return with IRQs disabled
> 
> Additionally, it added a branch for no good reason.
> 
> Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> Signed-off-by: Peter Zijlstra (Intel) 

After this change was introduced, I am seeing "WARNING: suspicious RCU
usage" when booting a kernel with debug options compiled in. Please
see the attached dmesg output. The issue starts with commit 32d4fd5751ea
and is still present in v5.19-rc8.

I'm not sure, is this too late to fix or revert in v5.19 final ?

Thanks,

--
Michel "walken" Lespinasse
[0.00] microcode: microcode updated early to revision 0x2006d05, date = 
2021-11-13
[0.00] Linux version 5.19.0-rc8-test-3-ge3a8d97e6a35 (walken@zeus) 
(gcc (Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP 
PREEMPT_DYNAMIC Mon Jul 25 00:32:16 PDT 2022
[0.00] Command line: 
BOOT_IMAGE=/vmlinuz-5.19.0-rc8-test-3-ge3a8d97e6a35 
root=/dev/mapper/budai--vg-root ro consoleblank=600 quiet
[0.00] KERNEL supported cpus:
[0.00]   Intel GenuineIntel
[0.00]   AMD AuthenticAMD
[0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point 
registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
[0.00] x86/fpu: Supporting XSAVE feature 0x020: 'AVX-512 opmask'
[0.00] x86/fpu: Supporting XSAVE feature 0x040: 'AVX-512 Hi256'
[0.00] x86/fpu: Supporting XSAVE feature 0x080: 'AVX-512 ZMM_Hi256'
[0.00] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[0.00] x86/fpu: xstate_offset[3]:  832, xstate_sizes[3]:   64
[0.00] x86/fpu: xstate_offset[4]:  896, xstate_sizes[4]:   64
[0.00] x86/fpu: xstate_offset[5]:  960, xstate_sizes[5]:   64
[0.00] x86/fpu: xstate_offset[6]: 1024, xstate_sizes[6]:  512
[0.00] x86/fpu: xstate_offset[7]: 1536, xstate_sizes[7]: 1024
[0.00] x86/fpu: Enabled xstate features 0xff, context size is 2560 
bytes, using 'compacted' format.
[0.00] signal: max sigframe size: 3632
[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0003efff] usable
[0.00] BIOS-e820: [mem 0x0003f000-0x0003] reserved
[0.00] BIOS-e820: [mem 0x0004-0x0009] usable
[0.00] BIOS-e820: [mem 0x000a-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x62a8efff] usable
[0.00] BIOS-e820: [mem 0x62a8f000-0x6aef4fff] reserved
[0.00] BIOS-e820: [mem 0x6aef5000-0x6b0c2fff] ACPI data
[0.00] BIOS-e820: [mem 0x6b0c3000-0x6be0dfff] ACPI NVS
[0.00] BIOS-e820: [mem 0x6be0e000-0x6c7f1fff] reserved
[0.00] BIOS-e820: [mem 0x6c7f2000-0x6c8e5fff] type 20
[0.00] BIOS-e820: [mem 0x6c8e6000-0x6c8e6fff] reserved
[0.00] BIOS-e820: [mem 0x6c8e7000-0x6c9a7fff] type 20
[0.00] BIOS-e820: [mem 0x6c9a8000-0x6e6f] usable
[0.00] BIOS-e820: [mem 0x6e70-0x8fff] reserved
[0.00] BIOS-e820: [mem 0xfd00-0xfe010fff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xfed0-0xfed00fff] reserved
[0.00] BIOS-e820: [mem 0xfed2-0xfed44fff] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0xff00-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x00107fff] usable
[0.00] NX (Execute Disable) protection: active
[0.00] efi: EFI v2.70 by American Megatrends
[0.00] efi: TPMFinalLog=0x6bd97000 ACPI=0x6b0c2000 ACPI 2.0=0x6b0c2014 
SMBIOS=0x6c47b000 SMBIOS 3.0=0x6c47a000 MEMATTR=0x5d04d018 ESRT=0x604e0718 
[0.00] SMBIOS 3.0.0 present.
[0.00] DMI: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
[0.00] tsc: Detected 3700.000 MHz processor
[0.00] tsc: Detected 3699.850 MHz TSC
[0.000278] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.000281] e820: remove [mem 0x000a-0x000f] usable
[0.000289] last_pfn = 0x108 max_arch_pfn = 0x4
[0.000397] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT  
[0.001626] e820: update [mem 

Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-06-17 Thread Jacob Pan
, Arnd Bergmann , ulli.kr...@googlemail.com, vgu...@kernel.org, 
linux-...@vger.kernel.org, j...@joshtriplett.org, rost...@goodmis.org, 
r...@vger.kernel.org, b...@alien8.de, bc...@quicinc.com, 
tsbog...@alpha.franken.de, linux-par...@vger.kernel.org, sudeep.ho...@arm.com, 
shawn...@kernel.org, da...@davemloft.net, dal...@libc.org, t...@atomide.com, 
amakha...@vmware.com, bjorn.anders...@linaro.org, h...@zytor.com, 
sparcli...@vger.kernel.org, linux-hexa...@vger.kernel.org, 
linux-ri...@lists.infradead.org, anton.iva...@cambridgegreys.com, 
jo...@southpole.se, yury.no...@gmail.com, rich...@nod.at, x...@kernel.org, 
li...@armlinux.org.uk, mi...@redhat.com, a...@eecs.berkeley.edu, 
paul...@kernel.org, h...@linux.ibm.com, stefan.kristians...@saunalahti.fi, 
openr...@lists.librecores.org, paul.walms...@sifive.com, 
linux-te...@vger.kernel.org, namhy...@kernel.org, 
andriy.shevche...@linux.intel.com, jpoim...@kernel.org, jgr...@suse.com, 
mon...@monstr.eu, linux-m...@vger.kernel.org, palmer@dab
 belt.com, a...@brainfault.org, i...@jurassic.park.msu.ru, 
johan...@sipsolutions.net, linuxppc-dev@lists.ozlabs.org
Errors-To: linuxppc-dev-bounces+archive=mail-archive@lists.ozlabs.org
Sender: "Linuxppc-dev" 


Hi Peter,

On Mon, 13 Jun 2022 10:44:22 +0200, Peter Zijlstra 
wrote:

> On Thu, Jun 09, 2022 at 04:49:21PM -0700, Jacob Pan wrote:
> > Hi Peter,
> > 
> > On Wed, 08 Jun 2022 16:27:27 +0200, Peter Zijlstra
> >  wrote:
> >   
> > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > Xeons") wrecked intel_idle in two ways:
> > > 
> > >  - must not have tracing in idle functions
> > >  - must return with IRQs disabled
> > > 
> > > Additionally, it added a branch for no good reason.
> > > 
> > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > Xeons") Signed-off-by: Peter Zijlstra (Intel) 
> > > ---
> > >  drivers/idle/intel_idle.c |   48
> > > +++--- 1 file changed, 37
> > > insertions(+), 11 deletions(-)
> > > 
> > > --- a/drivers/idle/intel_idle.c
> > > +++ b/drivers/idle/intel_idle.c
> > > @@ -129,21 +137,37 @@ static unsigned int mwait_substates __in
> > >   *
> > >   * Must be called under local_irq_disable().
> > >   */  
> > nit: this comment is no long true, right?  
> 
> It still is, all the idle routines are called with interrupts disabled,
> but must also exit with interrupts disabled.
> 
> If the idle method requires interrupts to be enabled, it must be sure to
> disable them again before returning. Given all the RCU/tracing concerns
> it must use raw_local_irq_*() for this though.
Makes sense, it is just little confusing when the immediate caller does
raw_local_irq_enable() which does not cancel out local_irq_disable().

Thanks,

Jacob


Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-06-13 Thread Peter Zijlstra
, ulli.kr...@googlemail.com, vgu...@kernel.org, linux-...@vger.kernel.org, 
j...@joshtriplett.org, rost...@goodmis.org, r...@vger.kernel.org, 
b...@alien8.de, bc...@quicinc.com, tsbog...@alpha.franken.de, 
linux-par...@vger.kernel.org, sudeep.ho...@arm.com, shawn...@kernel.org, 
da...@davemloft.net, dal...@libc.org, t...@atomide.com, amakha...@vmware.com, 
bjorn.anders...@linaro.org, h...@zytor.com, sparcli...@vger.kernel.org, 
linux-hexa...@vger.kernel.org, linux-ri...@lists.infradead.org, 
anton.iva...@cambridgegreys.com, jo...@southpole.se, yury.no...@gmail.com, 
rich...@nod.at, x...@kernel.org, li...@armlinux.org.uk, mi...@redhat.com, 
a...@eecs.berkeley.edu, paul...@kernel.org, h...@linux.ibm.com, 
stefan.kristians...@saunalahti.fi, openr...@lists.librecores.org, 
paul.walms...@sifive.com, linux-te...@vger.kernel.org, namhy...@kernel.org, 
andriy.shevche...@linux.intel.com, jpoim...@kernel.org, jgr...@suse.com, 
mon...@monstr.eu, linux-m...@vger.kernel.org, pal...@dabbelt.com, 
a...@brainfault.org, 
 i...@jurassic.park.msu.ru, johan...@sipsolutions.net, 
linuxppc-dev@lists.ozlabs.org
Errors-To: linuxppc-dev-bounces+archive=mail-archive@lists.ozlabs.org
Sender: "Linuxppc-dev" 


On Thu, Jun 09, 2022 at 04:49:21PM -0700, Jacob Pan wrote:
> Hi Peter,
> 
> On Wed, 08 Jun 2022 16:27:27 +0200, Peter Zijlstra 
> wrote:
> 
> > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > Xeons") wrecked intel_idle in two ways:
> > 
> >  - must not have tracing in idle functions
> >  - must return with IRQs disabled
> > 
> > Additionally, it added a branch for no good reason.
> > 
> > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > Signed-off-by: Peter Zijlstra (Intel) 
> > ---
> >  drivers/idle/intel_idle.c |   48
> > +++--- 1 file changed, 37
> > insertions(+), 11 deletions(-)
> > 
> > --- a/drivers/idle/intel_idle.c
> > +++ b/drivers/idle/intel_idle.c
> > @@ -129,21 +137,37 @@ static unsigned int mwait_substates __in
> >   *
> >   * Must be called under local_irq_disable().
> >   */
> nit: this comment is no long true, right?

It still is, all the idle routines are called with interrupts disabled,
but must also exit with interrupts disabled.

If the idle method requires interrupts to be enabled, it must be sure to
disable them again before returning. Given all the RCU/tracing concerns
it must use raw_local_irq_*() for this though.


Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-06-10 Thread Jacob Pan
, Arnd Bergmann , ulli.kr...@googlemail.com, vgu...@kernel.org, 
linux-...@vger.kernel.org, j...@joshtriplett.org, rost...@goodmis.org, 
r...@vger.kernel.org, b...@alien8.de, bc...@quicinc.com, 
tsbog...@alpha.franken.de, linux-par...@vger.kernel.org, sudeep.ho...@arm.com, 
shawn...@kernel.org, da...@davemloft.net, dal...@libc.org, t...@atomide.com, 
amakha...@vmware.com, bjorn.anders...@linaro.org, h...@zytor.com, 
sparcli...@vger.kernel.org, linux-hexa...@vger.kernel.org, 
linux-ri...@lists.infradead.org, anton.iva...@cambridgegreys.com, 
jo...@southpole.se, yury.no...@gmail.com, rich...@nod.at, x...@kernel.org, 
li...@armlinux.org.uk, mi...@redhat.com, a...@eecs.berkeley.edu, 
paul...@kernel.org, h...@linux.ibm.com, stefan.kristians...@saunalahti.fi, 
openr...@lists.librecores.org, paul.walms...@sifive.com, 
linux-te...@vger.kernel.org, namhy...@kernel.org, 
andriy.shevche...@linux.intel.com, jpoim...@kernel.org, jgr...@suse.com, 
mon...@monstr.eu, linux-m...@vger.kernel.org, palmer@dab
 belt.com, a...@brainfault.org, i...@jurassic.park.msu.ru, 
johan...@sipsolutions.net, linuxppc-dev@lists.ozlabs.org
Errors-To: linuxppc-dev-bounces+archive=mail-archive@lists.ozlabs.org
Sender: "Linuxppc-dev" 


Hi Peter,

On Wed, 08 Jun 2022 16:27:27 +0200, Peter Zijlstra 
wrote:

> Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> Xeons") wrecked intel_idle in two ways:
> 
>  - must not have tracing in idle functions
>  - must return with IRQs disabled
> 
> Additionally, it added a branch for no good reason.
> 
> Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  drivers/idle/intel_idle.c |   48
> +++--- 1 file changed, 37
> insertions(+), 11 deletions(-)
> 
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -129,21 +137,37 @@ static unsigned int mwait_substates __in
>   *
>   * Must be called under local_irq_disable().
>   */
nit: this comment is no long true, right?

> +
> -static __cpuidle int intel_idle(struct cpuidle_device *dev,
> - struct cpuidle_driver *drv, int index)
> +static __always_inline int __intel_idle(struct cpuidle_device *dev,
> + struct cpuidle_driver *drv, int
> index) {
>   struct cpuidle_state *state = >states[index];
>   unsigned long eax = flg2MWAIT(state->flags);
>   unsigned long ecx = 1; /* break on interrupt flag */
>  
> - if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
> - local_irq_enable();
> -
>   mwait_idle_with_hints(eax, ecx);
>  
>   return index;
>  }
>  
> +static __cpuidle int intel_idle(struct cpuidle_device *dev,
> + struct cpuidle_driver *drv, int index)
> +{
> + return __intel_idle(dev, drv, index);
> +}
> +
> +static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
> + struct cpuidle_driver *drv, int
> index) +{
> + int ret;
> +
> + raw_local_irq_enable();
> + ret = __intel_idle(dev, drv, index);
> + raw_local_irq_disable();
> +
> + return ret;
> +}
> +
>  /**
>   * intel_idle_s2idle - Ask the processor to enter the given idle state.
>   * @dev: cpuidle device of the target CPU.
> @@ -1801,6 +1824,9 @@ static void __init intel_idle_init_cstat
>   /* Structure copy. */
>   drv->states[drv->state_count] =
> cpuidle_state_table[cstate]; 
> + if (cpuidle_state_table[cstate].flags &
> CPUIDLE_FLAG_IRQ_ENABLE)
> + drv->states[drv->state_count].enter =
> intel_idle_irq; +
>   if ((disabled_states_mask & BIT(drv->state_count)) ||
>   ((icpu->use_acpi || force_use_acpi) &&
>intel_idle_off_by_default(mwait_hint) &&
> 
> 


Thanks,

Jacob


Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-06-08 Thread Rafael J. Wysocki
ieu.desnoy...@efficios.com>, Frederic Weisbecker , Len 
Brown , linux-xte...@linux-xtensa.org, Sascha Hauer 
, Vasily Gorbik , linux-arm-msm 
, linux-al...@vger.kernel.org, linux-m68k 
, Stafford Horne , Linux ARM 
, ch...@zankel.net, Stephen Boyd 
, dingu...@kernel.org, Daniel Bristot de Oliveira 
, Alexander Shishkin , 
lpieral...@kernel.org, Rasmus Villemoes , Joel 
Fernandes , Will Deacon , Boris 
Ostrovsky , Kevin Hilman , 
linux-c...@vger.kernel.org, pv-driv...@vmware.com, 
linux-snps-...@lists.infradead.org, Mel Gorman , Jacob Pan 
, Arnd Bergmann , ulli.kr...@googlemail.com, vgu...@kernel.org, linux-clk 
, Josh Triplett , Steven 
Rostedt , r...@vger.kernel.org, Borislav Petkov 
, bc...@quicinc.com, Thomas Bogendoerfer 
, Parisc List , Sudeep 
Holla , Shawn Guo , David Miller 
, Rich Felker , Tony Lindgren 
, amakha...@vmware.com, Bjorn Andersson 
, "H. Peter Anvin" , 
sparcli...@vger.kernel.org, linux-hexa...@vger.kernel.org, linux-riscv 
, anton.iva...@cambridgegreys.com, 
jo...@southpole.se, Yury Norov , Richard Weinberger 
, the arch/x86 maintainers , Russell King - 
ARM Linux , Ingo Molnar , Albert Ou 
, "P
 aul E. McKenney" , Heiko Carstens
 , stefan.kristians...@saunalahti.fi, 
openr...@lists.librecores.org, Paul Walmsley , 
linux-tegra , namhy...@kernel.org, Andy Shevchenko 
, jpoim...@kernel.org, Juergen Gross 
, Michal Simek , "open list:BROADCOM NVRAM 
DRIVER" , Palmer Dabbelt , Anup 
Patel , i...@jurassic.park.msu.ru, Johannes Berg 
, linuxppc-dev 
Errors-To: linuxppc-dev-bounces+archive=mail-archive@lists.ozlabs.org
Sender: "Linuxppc-dev" 


On Wed, Jun 8, 2022 at 5:48 PM Peter Zijlstra  wrote:
>
> On Wed, Jun 08, 2022 at 05:01:05PM +0200, Rafael J. Wysocki wrote:
> > On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra  wrote:
> > >
> > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > Xeons") wrecked intel_idle in two ways:
> > >
> > >  - must not have tracing in idle functions
> > >  - must return with IRQs disabled
> > >
> > > Additionally, it added a branch for no good reason.
> > >
> > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > Signed-off-by: Peter Zijlstra (Intel) 
> >
> > Acked-by: Rafael J. Wysocki 
> >
> > And do I think correctly that this can be applied without the rest of
> > the series?
>
> Yeah, I don't think this relies on any of the preceding patches. If you
> want to route this through the pm/fixes tree that's fine.

OK, thanks, applied (and I moved the intel_idle() kerneldoc so it is
next to the function to avoid the docs build warning).


Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-06-08 Thread Peter Zijlstra
el.org>, Len Brown , linux-xte...@linux-xtensa.org, Sascha 
Hauer , Vasily Gorbik , 
linux-arm-msm , linux-al...@vger.kernel.org, 
linux-m68k , Stafford Horne 
, Linux ARM , 
ch...@zankel.net, Stephen Boyd , dingu...@kernel.org, Daniel 
Bristot de Oliveira , Alexander Shishkin 
, Michael Turquette 
, Rasmus Villemoes , Joel 
Fernandes , Will Deacon , Boris 
Ostrovsky , Kevin Hilman , 
linux-c...@vger.kernel.org, pv-driv...@vmware.com, 
linux-snps-...@lists.infradead.org, Mel Gorman , Jacob Pan 
, Arnd Bergmann , 
ulli.kr...@googlemail.com, vgupta@ker
 nel.org, linux-clk , Josh Triplett 
, Steven Rostedt , 
r...@vger.kernel.org, Borislav Petkov , bc...@quicinc.com, 
Thomas Bogendoerfer , Parisc List 
, Sudeep Holla , Shawn Guo 
, David Miller , Rich Felker 
, Tony Lindgren , amakha...@vmware.com, 
Bjorn Andersson , "H. Peter Anvin" 
, sparcli...@vger.kernel.org, linux-hexa...@vger.kernel.org, 
linux-riscv , anton.iva...@cambridgegreys.com, 
jo...@southpole.se, Yury Norov , Richard Weinberger 
, the arch/x86 maintainers , Russell King - 
ARM Linux , Ingo Molnar , Albert Ou 
, "Paul E. McKenney" , He
 iko Carstens , stefan.kristiansson
@saunalahti.fi, openr...@lists.librecores.org, Paul Walmsley 
, linux-tegra , 
namhy...@kernel.org, Andy Shevchenko , 
jpoim...@kernel.org, Juergen Gross , Michal Simek 
, "open list:BROADCOM NVRAM DRIVER" 
, Palmer Dabbelt , Anup Patel 
, i...@jurassic.park.msu.ru, Johannes Berg 
, linuxppc-dev 
Errors-To: linuxppc-dev-bounces+archive=mail-archive@lists.ozlabs.org
Sender: "Linuxppc-dev" 


On Wed, Jun 08, 2022 at 05:01:05PM +0200, Rafael J. Wysocki wrote:
> On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra  wrote:
> >
> > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > Xeons") wrecked intel_idle in two ways:
> >
> >  - must not have tracing in idle functions
> >  - must return with IRQs disabled
> >
> > Additionally, it added a branch for no good reason.
> >
> > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > Signed-off-by: Peter Zijlstra (Intel) 
> 
> Acked-by: Rafael J. Wysocki 
> 
> And do I think correctly that this can be applied without the rest of
> the series?

Yeah, I don't think this relies on any of the preceding patches. If you
want to route this through the pm/fixes tree that's fine.

Thanks!


Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

2022-06-08 Thread Rafael J. Wysocki
ieu.desnoy...@efficios.com>, Frederic Weisbecker , Len 
Brown , linux-xte...@linux-xtensa.org, Sascha Hauer 
, Vasily Gorbik , linux-arm-msm 
, linux-al...@vger.kernel.org, linux-m68k 
, Stafford Horne , Linux ARM 
, ch...@zankel.net, Stephen Boyd 
, dingu...@kernel.org, Daniel Bristot de Oliveira 
, Alexander Shishkin , 
lpieral...@kernel.org, Rasmus Villemoes , Joel 
Fernandes , Will Deacon , Boris 
Ostrovsky , Kevin Hilman , 
linux-c...@vger.kernel.org, pv-driv...@vmware.com, 
linux-snps-...@lists.infradead.org, Mel Gorman , Jacob Pan 
, Arnd Bergmann , ulli.kr...@googlemail.com, vgu...@kernel.org, linux-clk 
, Josh Triplett , Steven 
Rostedt , r...@vger.kernel.org, Borislav Petkov 
, bc...@quicinc.com, Thomas Bogendoerfer 
, Parisc List , Sudeep 
Holla , Shawn Guo , David Miller 
, Rich Felker , Tony Lindgren 
, amakha...@vmware.com, Bjorn Andersson 
, "H. Peter Anvin" , 
sparcli...@vger.kernel.org, linux-hexa...@vger.kernel.org, linux-riscv 
, anton.iva...@cambridgegreys.com, 
jo...@southpole.se, Yury Norov , Richard Weinberger 
, the arch/x86 maintainers , Russell King - 
ARM Linux , Ingo Molnar , Albert Ou 
, "P
 aul E. McKenney" , Heiko Carstens
 , stefan.kristians...@saunalahti.fi, 
openr...@lists.librecores.org, Paul Walmsley , 
linux-tegra , namhy...@kernel.org, Andy Shevchenko 
, jpoim...@kernel.org, Juergen Gross 
, Michal Simek , "open list:BROADCOM NVRAM 
DRIVER" , Palmer Dabbelt , Anup 
Patel , i...@jurassic.park.msu.ru, Johannes Berg 
, linuxppc-dev 
Errors-To: linuxppc-dev-bounces+archive=mail-archive@lists.ozlabs.org
Sender: "Linuxppc-dev" 


On Wed, Jun 8, 2022 at 4:47 PM Peter Zijlstra  wrote:
>
> Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> Xeons") wrecked intel_idle in two ways:
>
>  - must not have tracing in idle functions
>  - must return with IRQs disabled
>
> Additionally, it added a branch for no good reason.
>
> Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> Signed-off-by: Peter Zijlstra (Intel) 

Acked-by: Rafael J. Wysocki 

And do I think correctly that this can be applied without the rest of
the series?

> ---
>  drivers/idle/intel_idle.c |   48 
> +++---
>  1 file changed, 37 insertions(+), 11 deletions(-)
>
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -129,21 +137,37 @@ static unsigned int mwait_substates __in
>   *
>   * Must be called under local_irq_disable().
>   */
> +
> -static __cpuidle int intel_idle(struct cpuidle_device *dev,
> -   struct cpuidle_driver *drv, int index)
> +static __always_inline int __intel_idle(struct cpuidle_device *dev,
> +   struct cpuidle_driver *drv, int index)
>  {
> struct cpuidle_state *state = >states[index];
> unsigned long eax = flg2MWAIT(state->flags);
> unsigned long ecx = 1; /* break on interrupt flag */
>
> -   if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
> -   local_irq_enable();
> -
> mwait_idle_with_hints(eax, ecx);
>
> return index;
>  }
>
> +static __cpuidle int intel_idle(struct cpuidle_device *dev,
> +   struct cpuidle_driver *drv, int index)
> +{
> +   return __intel_idle(dev, drv, index);
> +}
> +
> +static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
> +   struct cpuidle_driver *drv, int index)
> +{
> +   int ret;
> +
> +   raw_local_irq_enable();
> +   ret = __intel_idle(dev, drv, index);
> +   raw_local_irq_disable();
> +
> +   return ret;
> +}
> +
>  /**
>   * intel_idle_s2idle - Ask the processor to enter the given idle state.
>   * @dev: cpuidle device of the target CPU.
> @@ -1801,6 +1824,9 @@ static void __init intel_idle_init_cstat
> /* Structure copy. */
> drv->states[drv->state_count] = cpuidle_state_table[cstate];
>
> +   if (cpuidle_state_table[cstate].flags & 
> CPUIDLE_FLAG_IRQ_ENABLE)
> +   drv->states[drv->state_count].enter = intel_idle_irq;
> +
> if ((disabled_states_mask & BIT(drv->state_count)) ||
> ((icpu->use_acpi || force_use_acpi) &&
>  intel_idle_off_by_default(mwait_hint) &&
>
>