Re: [next-20170217] WARN @/arch/powerpc/include/asm/xics.h:124 .icp_hv_eoi+0x40/0x140

2017-02-20 Thread Benjamin Herrenschmidt
On Mon, 2017-02-20 at 14:04 -0800, Thomas Gleixner wrote:
> > HOWEVER. Looking at current upstream code I don't understand the error,
> > the DEBUG_SHIRQ code is calling the driver's handler not the flow
> > handler so it shouldn't be calling handle_fasteoi_irq or am I missing
> > something ?
> 
> I tried to invoke the normal handler path which also invokes the flow
> handler, but that breaks on x86 as well for different reasons. I zapped
> that commit and still need to find a way to do that debug thing proper. So
> its appearance in -next was only temporary.

Ok I see. Yes I wouldn't be surprised if we aren't the only ones to
expect that one get_irq() matches *one* invocation of the flow handler.

We had to hack around this for irq_replay already but at least we have
a hook to do that.

You could possibly use replay, but what's wrong with what the code
currently does which is to just call the driver handler directly ?

Cheers,
Ben.



Re: [next-20170217] WARN @/arch/powerpc/include/asm/xics.h:124 .icp_hv_eoi+0x40/0x140

2017-02-20 Thread Thomas Gleixner
On Tue, 21 Feb 2017, Benjamin Herrenschmidt wrote:

> On Mon, 2017-02-20 at 21:55 +1100, Michael Ellerman wrote:
> > But when we're called for CONFIG_DEBUG_SHIRQ get_irq() is not called,
> > precisely because we're faking an interrupt.
> > 
> > I'm not sure if there's a good way to fix it :/
> 
> In the irq_replay path we have code to adjust the CPPR stack. We could
> do something similar.
> 
> HOWEVER. Looking at current upstream code I don't understand the error,
> the DEBUG_SHIRQ code is calling the driver's handler not the flow
> handler so it shouldn't be calling handle_fasteoi_irq or am I missing
> something ?

I tried to invoke the normal handler path which also invokes the flow
handler, but that breaks on x86 as well for different reasons. I zapped
that commit and still need to find a way to do that debug thing proper. So
its appearance in -next was only temporary.

Thanks,

tglx


Re: [next-20170217] WARN @/arch/powerpc/include/asm/xics.h:124 .icp_hv_eoi+0x40/0x140

2017-02-20 Thread Benjamin Herrenschmidt
On Mon, 2017-02-20 at 21:55 +1100, Michael Ellerman wrote:
> But when we're called for CONFIG_DEBUG_SHIRQ get_irq() is not called,
> precisely because we're faking an interrupt.
> 
> I'm not sure if there's a good way to fix it :/

In the irq_replay path we have code to adjust the CPPR stack. We could
do something similar.

HOWEVER. Looking at current upstream code I don't understand the error,
the DEBUG_SHIRQ code is calling the driver's handler not the flow
handler so it shouldn't be calling handle_fasteoi_irq or am I missing
something ?

Cheers,
Ben.



Re: [next-20170217] WARN @/arch/powerpc/include/asm/xics.h:124 .icp_hv_eoi+0x40/0x140

2017-02-20 Thread Michael Ellerman
Sachin Sant writes:

>>> While booting next-20170217 on a POWER6 box, I ran into following
>>> warning. This is a full system lpar. Previous next tree was good.
>>> I will try a bisect tomorrow.
>> 
>> Do you have CONFIG_DEBUG_SHIRQ=y ?
>> 
>
> Yes. CONFIG_DEBUG_SHIRQ is enabled.
>
> As suggested by you reverting following commit allows a clean boot.
> f91f694540f3 ("genirq: Reenable shared irq debugging in request_*_irq()")

OK. Or disabling CONFIG_DEBUG_SHIRQ :)

The problem is that the xics code saves the CPPR value in get_irq(),
called from __do_irq(), and then restores it in irq_eoi().

But when we're called for CONFIG_DEBUG_SHIRQ get_irq() is not called,
precisely because we're faking an interrupt.

I'm not sure if there's a good way to fix it :/
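
For readers following along, the imbalance described above can be modelled
in a few lines of userspace C. This is only a toy sketch, not the code in
arch/powerpc: the struct, the model_* function names and the warnings
counter are made up for illustration, and the priority constants are
assumed values.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed values, for illustration only. */
#define MAX_NUM_PRIORITIES 3
#define LOWEST_PRIORITY    0xff
#define DEFAULT_PRIORITY   5

/* Toy stand-in for a per-CPU CPPR stack; slot 0 holds the base priority. */
struct cppr_stack {
	unsigned char stack[MAX_NUM_PRIORITIES];
	int index;
	int warnings;	/* counts would-be WARN hits */
};

static void cppr_stack_init(struct cppr_stack *s)
{
	assert(s != NULL);
	s->stack[0] = LOWEST_PRIORITY;
	s->index = 0;
	s->warnings = 0;
}

/* Models get_irq(): save a priority for the interrupt being taken. */
static void model_get_irq(struct cppr_stack *s)
{
	if (s->index >= MAX_NUM_PRIORITIES - 1) {
		s->warnings++;	/* stack would overflow */
		return;
	}
	s->stack[++s->index] = DEFAULT_PRIORITY;
}

/* Models the EOI step: restore the priority that was current before. */
static unsigned char model_eoi(struct cppr_stack *s)
{
	if (s->index < 1) {
		/* Nothing was saved, so this EOI has no matching get_irq(). */
		s->warnings++;
		fprintf(stderr, "WARN: EOI without matching get_irq()\n");
		return LOWEST_PRIORITY;
	}
	return s->stack[--s->index];
}

/* Normal delivery: get_irq() first, then the flow handler issues the EOI. */
static void model_real_interrupt(struct cppr_stack *s)
{
	model_get_irq(s);
	model_eoi(s);
}

/* Faked interrupt: the flow handler runs and issues an EOI, no get_irq(). */
static void model_fake_interrupt(struct cppr_stack *s)
{
	model_eoi(s);
}
```

In this model, model_real_interrupt() leaves the stack balanced, while
model_fake_interrupt() trips the underflow check. That appears to be the
same kind of condition the WARN in the report is flagging.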

cheers

>>> ipr: IBM Power RAID SCSI Device Driver version: 2.6.3 (October 17, 2015)
>>> ipr 0200:00:01.0: Found IOA with IRQ: 305
>>> [ cut here ]
>>> WARNING: CPU: 12 PID: 1 at ./arch/powerpc/include/asm/xics.h:124 
>>> .icp_hv_eoi+0x40/0x140
>>> Modules linked in:
>>> CPU: 12 PID: 1 Comm: swapper/14 Not tainted 
>>> 4.10.0-rc8-next-20170217-autotest #1
>>> task: c002b2a4a580 task.stack: c002b2a5c000
>>> NIP: c00731b0 LR: c01389f8 CTR: c0073170
>>> REGS: c002b2a5f050 TRAP: 0700   Not tainted  
>>> (4.10.0-rc8-next-20170217-autotest)
>>> MSR: 80029032 
>>>  CR: 28004082  XER: 2004
>>> CFAR: c01389e0 SOFTE: 0 
>>> GPR00: c01389f8 c002b2a5f2d0 c1025800 c002b203f498 
>>> GPR04:   0064 0131 
>>> GPR08: 0001 c000d3104cb8  0009b1f8 
>>> GPR12: 48004082 cedc2400 c000dad0  
>>> GPR16:  3c007efc c0a9e848  
>>> GPR20: d8008008 c002af4d47f0 c11efda8 c0a9ea10 
>>> GPR24: c0a9e848  c002af4d4fb8  
>>> GPR28:  c002b203f498 c0ef8928 c002b203f400 
>>> NIP [c00731b0] .icp_hv_eoi+0x40/0x140
>>> LR [c01389f8] .handle_fasteoi_irq+0x1e8/0x270
>>> Call Trace:
>>> [c002b2a5f2d0] [c002b2a5f360] 0xc002b2a5f360 (unreliable)
>>> [c002b2a5f360] [c01389f8] .handle_fasteoi_irq+0x1e8/0x270
>>> [c002b2a5f3e0] [c0136a08] .request_threaded_irq+0x298/0x370
>>> [c002b2a5f490] [c05895c0] .ipr_probe_ioa+0x1110/0x1390
>>> [c002b2a5f5c0] [c058d030] .ipr_probe+0x30/0x3e0
>>> [c002b2a5f670] [c0466860] .local_pci_probe+0x60/0x130
>>> [c002b2a5f710] [c0467658] .pci_device_probe+0x148/0x1e0
>>> [c002b2a5f7c0] [c0527524] .driver_probe_device+0x2d4/0x5b0
>>> [c002b2a5f860] [c052796c] .__driver_attach+0x16c/0x190
>>> [c002b2a5f8f0] [c05242c4] .bus_for_each_dev+0x84/0xf0
>>> [c002b2a5f990] [c0526af4] .driver_attach+0x24/0x40
>>> [c002b2a5fa00] [c0526318] .bus_add_driver+0x2a8/0x370
>>> [c002b2a5faa0] [c0528a5c] .driver_register+0x8c/0x170
>>> [c002b2a5fb20] [c0465a54] .__pci_register_driver+0x44/0x60
>>> [c002b2a5fb90] [c0b8efc8] .ipr_init+0x58/0x70
>>> [c002b2a5fc10] [c000d20c] .do_one_initcall+0x5c/0x1c0
>>> [c002b2a5fce0] [c0b44738] .kernel_init_freeable+0x280/0x360
>>> [c002b2a5fdb0] [c000daec] .kernel_init+0x1c/0x130
>>> [c002b2a5fe30] [c000baa0] .ret_from_kernel_thread+0x58/0xb8
>>> Instruction dump:
>>> f8010010 f821ff71 80e3000c 7c0004ac e94d0030 3d02ffbc 3928f4b8 7d295214 
>>> 81090004 3948 7d484378 79080fe2 <0b08> 2fa8 40de0050 91490004 
>>> ---[ end trace 5e18ae409f46392c ]---
>>> ipr 0200:00:01.0: Initializing IOA.
>>> 
>>> Thanks
>>> -Sachin
>> 


Re: [next-20170217] WARN @/arch/powerpc/include/asm/xics.h:124 .icp_hv_eoi+0x40/0x140

2017-02-19 Thread Sachin Sant

>> While booting next-20170217 on a POWER6 box, I ran into following
>> warning. This is a full system lpar. Previous next tree was good.
>> I will try a bisect tomorrow.
> 
> Do you have CONFIG_DEBUG_SHIRQ=y ?
> 

Yes. CONFIG_DEBUG_SHIRQ is enabled.

As suggested by you reverting following commit allows a clean boot.
f91f694540f3 ("genirq: Reenable shared irq debugging in request_*_irq()")

>> ipr: IBM Power RAID SCSI Device Driver version: 2.6.3 (October 17, 2015)
>> ipr 0200:00:01.0: Found IOA with IRQ: 305
>> [ cut here ]
>> WARNING: CPU: 12 PID: 1 at ./arch/powerpc/include/asm/xics.h:124 
>> .icp_hv_eoi+0x40/0x140
>> Modules linked in:
>> CPU: 12 PID: 1 Comm: swapper/14 Not tainted 
>> 4.10.0-rc8-next-20170217-autotest #1
>> task: c002b2a4a580 task.stack: c002b2a5c000
>> NIP: c00731b0 LR: c01389f8 CTR: c0073170
>> REGS: c002b2a5f050 TRAP: 0700   Not tainted  
>> (4.10.0-rc8-next-20170217-autotest)
>> MSR: 80029032 
>>  CR: 28004082  XER: 2004
>> CFAR: c01389e0 SOFTE: 0 
>> GPR00: c01389f8 c002b2a5f2d0 c1025800 c002b203f498 
>> GPR04:   0064 0131 
>> GPR08: 0001 c000d3104cb8  0009b1f8 
>> GPR12: 48004082 cedc2400 c000dad0  
>> GPR16:  3c007efc c0a9e848  
>> GPR20: d8008008 c002af4d47f0 c11efda8 c0a9ea10 
>> GPR24: c0a9e848  c002af4d4fb8  
>> GPR28:  c002b203f498 c0ef8928 c002b203f400 
>> NIP [c00731b0] .icp_hv_eoi+0x40/0x140
>> LR [c01389f8] .handle_fasteoi_irq+0x1e8/0x270
>> Call Trace:
>> [c002b2a5f2d0] [c002b2a5f360] 0xc002b2a5f360 (unreliable)
>> [c002b2a5f360] [c01389f8] .handle_fasteoi_irq+0x1e8/0x270
>> [c002b2a5f3e0] [c0136a08] .request_threaded_irq+0x298/0x370
>> [c002b2a5f490] [c05895c0] .ipr_probe_ioa+0x1110/0x1390
>> [c002b2a5f5c0] [c058d030] .ipr_probe+0x30/0x3e0
>> [c002b2a5f670] [c0466860] .local_pci_probe+0x60/0x130
>> [c002b2a5f710] [c0467658] .pci_device_probe+0x148/0x1e0
>> [c002b2a5f7c0] [c0527524] .driver_probe_device+0x2d4/0x5b0
>> [c002b2a5f860] [c052796c] .__driver_attach+0x16c/0x190
>> [c002b2a5f8f0] [c05242c4] .bus_for_each_dev+0x84/0xf0
>> [c002b2a5f990] [c0526af4] .driver_attach+0x24/0x40
>> [c002b2a5fa00] [c0526318] .bus_add_driver+0x2a8/0x370
>> [c002b2a5faa0] [c0528a5c] .driver_register+0x8c/0x170
>> [c002b2a5fb20] [c0465a54] .__pci_register_driver+0x44/0x60
>> [c002b2a5fb90] [c0b8efc8] .ipr_init+0x58/0x70
>> [c002b2a5fc10] [c000d20c] .do_one_initcall+0x5c/0x1c0
>> [c002b2a5fce0] [c0b44738] .kernel_init_freeable+0x280/0x360
>> [c002b2a5fdb0] [c000daec] .kernel_init+0x1c/0x130
>> [c002b2a5fe30] [c000baa0] .ret_from_kernel_thread+0x58/0xb8
>> Instruction dump:
>> f8010010 f821ff71 80e3000c 7c0004ac e94d0030 3d02ffbc 3928f4b8 7d295214 
>> 81090004 3948 7d484378 79080fe2 <0b08> 2fa8 40de0050 91490004 
>> ---[ end trace 5e18ae409f46392c ]---
>> ipr 0200:00:01.0: Initializing IOA.
>> 
>> Thanks
>> -Sachin
> 


