Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt
2016-10-12 21:17 GMT+08:00 Thomas Gleixner : > On Wed, 12 Oct 2016, Wanpeng Li wrote: >> irq_enter() which is called in scheduler_ipi() is too late to tell RCU >> susbstems to end the extended quiescent state before ack_APIC_irq(), >> any ideas? > > You can call irq_enter/exit() in __smp_reschedule_interrupt(). It can be > called nested. In smp_reschedule_interrupt()? Do you mean something like this? diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index 658777c..ac2ee87 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -259,8 +259,10 @@ static inline void __smp_reschedule_interrupt(void) __visible void smp_reschedule_interrupt(struct pt_regs *regs) { + irq_enter(); ack_APIC_irq(); __smp_reschedule_interrupt(); + irq_exit(); /* * KVM uses this interrupt to force a cpu out of guest mode */
Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt
On Wed, 12 Oct 2016, Wanpeng Li wrote: > irq_enter() which is called in scheduler_ipi() is too late to tell RCU > susbstems to end the extended quiescent state before ack_APIC_irq(), > any ideas? You can call irq_enter/exit() in __smp_reschedule_interrupt(). It can be called nested. Thanks, tglx
Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt
2016-09-19 16:10 GMT+08:00 Peter Zijlstra : > On Thu, Sep 15, 2016 at 10:58:04AM +0200, Thomas Gleixner wrote: >> On Thu, 15 Sep 2016, Wanpeng Li wrote: >> > --- >> > arch/x86/include/asm/apic.h | 2 +- >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> > >> > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h >> > index 1243577..71c1fe2 100644 >> > --- a/arch/x86/include/asm/apic.h >> > +++ b/arch/x86/include/asm/apic.h >> > @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void) >> > >> > static inline void ipi_entering_ack_irq(void) >> > { >> > - ack_APIC_irq(); >> > irq_enter(); >> > + ack_APIC_irq(); >> > } >> >> which makes ipi_entering_ack_irq() the same as entering_ack_irq() and >> therefor pointless. > > entering_ack_irq() seems to use entering_irq() instead of irq_enter(). > Which is close but not the same. This thing seems to also do > exit_idle(). > > Now, there's only a handfull of ipi_entering_ack_irq() users, and it > doesn't seem to make sense to me to only call exit_idle() on IPIs, why > don't we need to call exit_idle() on regular IRQs ?! > > All in all, that stuff is crufty and needs a cleanup I'd say. [ 116.587762] [ 116.587768] === [ 116.587770] [ INFO: suspicious RCU usage. ] [ 116.587773] 4.8.0+ #24 Not tainted [ 116.587775] --- [ 116.58] ./arch/x86/include/asm/msr-trace.h:47 suspicious rcu_dereference_check() usage! [ 116.587779] [ 116.587779] other info that might help us debug this: [ 116.587779] [ 116.587782] [ 116.587782] RCU used illegally from idle CPU! [ 116.587782] rcu_scheduler_active = 1, debug_locks = 0 [ 116.587785] RCU used illegally from extended quiescent state! [ 116.587787] no locks held by swapper/1/0. [ 116.587788] [ 116.587788] stack backtrace: [ 116.587792] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0+ #24 [ 116.587794] Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015 [ 116.587796] 90285de03f58 9d44a0c9 90285ca5d100 0001 [ 116.587803] 90285de03f88 9d0ebd67 902845165410 080b [ 116.587809] 90285de03fb8 9d492b95 [ 116.587814] Call Trace: [ 116.587817][] dump_stack+0x99/0xd0 [ 116.587827] [] lockdep_rcu_suspicious+0xe7/0x120 [ 116.587832] [] do_trace_write_msr+0x135/0x140 [ 116.587836] [] native_write_msr+0x20/0x30 [ 116.587841] [] native_apic_msr_eoi_write+0x1d/0x30 [ 116.587845] [] smp_reschedule_interrupt+0x1d/0x30 [ 116.587849] [] reschedule_interrupt+0x96/0xa0 [ 116.587851][] ? cpuidle_enter_state+0xe4/0x360 [ 116.587858] [] ? cpuidle_enter_state+0xcf/0x360 [ 116.587861] [] cpuidle_enter+0x17/0x20 [ 116.587865] [] call_cpuidle+0x23/0x50 [ 116.587868] [] cpu_startup_entry+0x15c/0x280 [ 116.587872] [] start_secondary+0x154/0x180 irq_enter() which is called in scheduler_ipi() is too late to tell RCU susbstems to end the extended quiescent state before ack_APIC_irq(), any ideas? Regards, Wanpeng Li
Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt
On Thu, Sep 15, 2016 at 10:58:04AM +0200, Thomas Gleixner wrote: > On Thu, 15 Sep 2016, Wanpeng Li wrote: > > --- > > arch/x86/include/asm/apic.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h > > index 1243577..71c1fe2 100644 > > --- a/arch/x86/include/asm/apic.h > > +++ b/arch/x86/include/asm/apic.h > > @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void) > > > > static inline void ipi_entering_ack_irq(void) > > { > > - ack_APIC_irq(); > > irq_enter(); > > + ack_APIC_irq(); > > } > > which makes ipi_entering_ack_irq() the same as entering_ack_irq() and > therefor pointless. entering_ack_irq() seems to use entering_irq() instead of irq_enter(). Which is close but not the same. This thing seems to also do exit_idle(). Now, there's only a handfull of ipi_entering_ack_irq() users, and it doesn't seem to make sense to me to only call exit_idle() on IPIs, why don't we need to call exit_idle() on regular IRQs ?! All in all, that stuff is crufty and needs a cleanup I'd say.
Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt
2016-09-15 16:58 GMT+08:00 Thomas Gleixner : > On Thu, 15 Sep 2016, Wanpeng Li wrote: >> --- >> arch/x86/include/asm/apic.h | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h >> index 1243577..71c1fe2 100644 >> --- a/arch/x86/include/asm/apic.h >> +++ b/arch/x86/include/asm/apic.h >> @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void) >> >> static inline void ipi_entering_ack_irq(void) >> { >> - ack_APIC_irq(); >> irq_enter(); >> + ack_APIC_irq(); >> } > > which makes ipi_entering_ack_irq() the same as entering_ack_irq() and > therefor pointless. They are still similar before this fix. https://lkml.org/lkml/2016/3/18/332 I just sent out v2 for both ipi_entering_ack_irq() and exiting_ack_irq(). Regards, Wanpeng Li
Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt
2016-09-15 16:58 GMT+08:00 Thomas Gleixner : > On Thu, 15 Sep 2016, Wanpeng Li wrote: >> --- >> arch/x86/include/asm/apic.h | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h >> index 1243577..71c1fe2 100644 >> --- a/arch/x86/include/asm/apic.h >> +++ b/arch/x86/include/asm/apic.h >> @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void) >> >> static inline void ipi_entering_ack_irq(void) >> { >> - ack_APIC_irq(); >> irq_enter(); >> + ack_APIC_irq(); >> } > > which makes ipi_entering_ack_irq() the same as entering_ack_irq() and > therefor pointless. > > Looking further we have the same issue in exiting_ack_irq() where we call > ack_APIC_irq() after irq_exit(). Do you think I should fix both ipi_entering_ack_irq() and exiting_ack_irq()? Regards, Wanpeng Li
Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt
On Thu, 15 Sep 2016, Wanpeng Li wrote: > --- > arch/x86/include/asm/apic.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h > index 1243577..71c1fe2 100644 > --- a/arch/x86/include/asm/apic.h > +++ b/arch/x86/include/asm/apic.h > @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void) > > static inline void ipi_entering_ack_irq(void) > { > - ack_APIC_irq(); > irq_enter(); > + ack_APIC_irq(); > } which makes ipi_entering_ack_irq() the same as entering_ack_irq() and therefor pointless. Looking further we have the same issue in exiting_ack_irq() where we call ack_APIC_irq() after irq_exit(). Thanks, tglx