Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt

2016-10-13 Thread Wanpeng Li
2016-10-12 21:17 GMT+08:00 Thomas Gleixner :
> On Wed, 12 Oct 2016, Wanpeng Li wrote:
>> irq_enter() which is called in scheduler_ipi() is too late to tell RCU
>> susbstems to end the extended quiescent state before ack_APIC_irq(),
>> any ideas?
>
> You can call irq_enter/exit() in __smp_reschedule_interrupt(). It can be
> called nested.

In smp_reschedule_interrupt()? Do you mean something like this?

diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 658777c..ac2ee87 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -259,8 +259,10 @@ static inline void __smp_reschedule_interrupt(void)

 __visible void smp_reschedule_interrupt(struct pt_regs *regs)
 {
+   irq_enter();
ack_APIC_irq();
__smp_reschedule_interrupt();
+   irq_exit();
/*
 * KVM uses this interrupt to force a cpu out of guest mode
 */


Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt

2016-10-12 Thread Thomas Gleixner
On Wed, 12 Oct 2016, Wanpeng Li wrote:
> irq_enter() which is called in scheduler_ipi() is too late to tell RCU
> susbstems to end the extended quiescent state before ack_APIC_irq(),
> any ideas?

You can call irq_enter/exit() in __smp_reschedule_interrupt(). It can be
called nested.

Thanks,

tglx


Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt

2016-10-12 Thread Wanpeng Li
2016-09-19 16:10 GMT+08:00 Peter Zijlstra :
> On Thu, Sep 15, 2016 at 10:58:04AM +0200, Thomas Gleixner wrote:
>> On Thu, 15 Sep 2016, Wanpeng Li wrote:
>> > ---
>> >  arch/x86/include/asm/apic.h | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
>> > index 1243577..71c1fe2 100644
>> > --- a/arch/x86/include/asm/apic.h
>> > +++ b/arch/x86/include/asm/apic.h
>> > @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void)
>> >
>> >  static inline void ipi_entering_ack_irq(void)
>> >  {
>> > -   ack_APIC_irq();
>> > irq_enter();
>> > +   ack_APIC_irq();
>> >  }
>>
>> which makes ipi_entering_ack_irq() the same as entering_ack_irq() and
>> therefor pointless.
>
> entering_ack_irq() seems to use entering_irq() instead of irq_enter().
> Which is close but not the same. This thing seems to also do
> exit_idle().
>
> Now, there's only a handfull of ipi_entering_ack_irq() users, and it
> doesn't seem to make sense to me to only call exit_idle() on IPIs, why
> don't we need to call exit_idle() on regular IRQs ?!
>
> All in all, that stuff is crufty and needs a cleanup I'd say.

[  116.587762]
[  116.587768] ===
[  116.587770] [ INFO: suspicious RCU usage. ]
[  116.587773] 4.8.0+ #24 Not tainted
[  116.587775] ---
[  116.58] ./arch/x86/include/asm/msr-trace.h:47 suspicious
rcu_dereference_check() usage!
[  116.587779]
[  116.587779] other info that might help us debug this:
[  116.587779]
[  116.587782]
[  116.587782] RCU used illegally from idle CPU!
[  116.587782] rcu_scheduler_active = 1, debug_locks = 0
[  116.587785] RCU used illegally from extended quiescent state!
[  116.587787] no locks held by swapper/1/0.
[  116.587788]
[  116.587788] stack backtrace:
[  116.587792] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0+ #24
[  116.587794] Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03
01/08/2015
[  116.587796]  90285de03f58 9d44a0c9 90285ca5d100
0001
[  116.587803]  90285de03f88 9d0ebd67 902845165410
080b
[  116.587809]    90285de03fb8
9d492b95
[  116.587814] Call Trace:
[  116.587817][] dump_stack+0x99/0xd0
[  116.587827]  [] lockdep_rcu_suspicious+0xe7/0x120
[  116.587832]  [] do_trace_write_msr+0x135/0x140
[  116.587836]  [] native_write_msr+0x20/0x30
[  116.587841]  [] native_apic_msr_eoi_write+0x1d/0x30
[  116.587845]  [] smp_reschedule_interrupt+0x1d/0x30
[  116.587849]  [] reschedule_interrupt+0x96/0xa0
[  116.587851][] ? cpuidle_enter_state+0xe4/0x360
[  116.587858]  [] ? cpuidle_enter_state+0xcf/0x360
[  116.587861]  [] cpuidle_enter+0x17/0x20
[  116.587865]  [] call_cpuidle+0x23/0x50
[  116.587868]  [] cpu_startup_entry+0x15c/0x280
[  116.587872]  [] start_secondary+0x154/0x180

irq_enter() which is called in scheduler_ipi() is too late to tell RCU
susbstems to end the extended quiescent state before ack_APIC_irq(),
any ideas?

Regards,
Wanpeng Li


Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt

2016-09-19 Thread Peter Zijlstra
On Thu, Sep 15, 2016 at 10:58:04AM +0200, Thomas Gleixner wrote:
> On Thu, 15 Sep 2016, Wanpeng Li wrote:
> > ---
> >  arch/x86/include/asm/apic.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
> > index 1243577..71c1fe2 100644
> > --- a/arch/x86/include/asm/apic.h
> > +++ b/arch/x86/include/asm/apic.h
> > @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void)
> >  
> >  static inline void ipi_entering_ack_irq(void)
> >  {
> > -   ack_APIC_irq();
> > irq_enter();
> > +   ack_APIC_irq();
> >  }
> 
> which makes ipi_entering_ack_irq() the same as entering_ack_irq() and
> therefor pointless.

entering_ack_irq() seems to use entering_irq() instead of irq_enter().
Which is close but not the same. This thing seems to also do
exit_idle().

Now, there's only a handfull of ipi_entering_ack_irq() users, and it
doesn't seem to make sense to me to only call exit_idle() on IPIs, why
don't we need to call exit_idle() on regular IRQs ?!

All in all, that stuff is crufty and needs a cleanup I'd say.


Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt

2016-09-18 Thread Wanpeng Li
2016-09-15 16:58 GMT+08:00 Thomas Gleixner :
> On Thu, 15 Sep 2016, Wanpeng Li wrote:
>> ---
>>  arch/x86/include/asm/apic.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
>> index 1243577..71c1fe2 100644
>> --- a/arch/x86/include/asm/apic.h
>> +++ b/arch/x86/include/asm/apic.h
>> @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void)
>>
>>  static inline void ipi_entering_ack_irq(void)
>>  {
>> - ack_APIC_irq();
>>   irq_enter();
>> + ack_APIC_irq();
>>  }
>
> which makes ipi_entering_ack_irq() the same as entering_ack_irq() and
> therefor pointless.

They are still similar before this fix.
https://lkml.org/lkml/2016/3/18/332 I just sent out v2 for both
ipi_entering_ack_irq() and exiting_ack_irq().

Regards,
Wanpeng Li


Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt

2016-09-18 Thread Wanpeng Li
2016-09-15 16:58 GMT+08:00 Thomas Gleixner :
> On Thu, 15 Sep 2016, Wanpeng Li wrote:
>> ---
>>  arch/x86/include/asm/apic.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
>> index 1243577..71c1fe2 100644
>> --- a/arch/x86/include/asm/apic.h
>> +++ b/arch/x86/include/asm/apic.h
>> @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void)
>>
>>  static inline void ipi_entering_ack_irq(void)
>>  {
>> - ack_APIC_irq();
>>   irq_enter();
>> + ack_APIC_irq();
>>  }
>
> which makes ipi_entering_ack_irq() the same as entering_ack_irq() and
> therefor pointless.
>
> Looking further we have the same issue in exiting_ack_irq() where we call
> ack_APIC_irq() after irq_exit().

Do you think I should fix both ipi_entering_ack_irq() and exiting_ack_irq()?

Regards,
Wanpeng Li


Re: [PATCH] x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt

2016-09-15 Thread Thomas Gleixner
On Thu, 15 Sep 2016, Wanpeng Li wrote:
> ---
>  arch/x86/include/asm/apic.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
> index 1243577..71c1fe2 100644
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -650,8 +650,8 @@ static inline void entering_ack_irq(void)
>  
>  static inline void ipi_entering_ack_irq(void)
>  {
> - ack_APIC_irq();
>   irq_enter();
> + ack_APIC_irq();
>  }

which makes ipi_entering_ack_irq() the same as entering_ack_irq() and
therefor pointless.

Looking further we have the same issue in exiting_ack_irq() where we call
ack_APIC_irq() after irq_exit().

Thanks,

tglx