Re: [mainline][ppc][bnx2x] watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70 when module load/unload

2018-11-16 Thread Michael Ellerman
Abdul Haleem  writes:
> On Fri, 2018-11-16 at 15:44 +1100, Michael Ellerman wrote:
>> Abdul Haleem  writes:
>> > On Thu, 2018-11-15 at 16:40 +0530, Abdul Haleem wrote:
>> >> On Mon, 2018-09-24 at 15:49 +0530, Abdul Haleem wrote:
>> >> > On Mon, 2018-09-24 at 19:35 +1000, Oliver wrote:
>> >> > > On Mon, Sep 24, 2018 at 6:56 PM, Abdul Haleem
>> >> > >  wrote:
>> >> > > > Greetings,
>> >> > > >
>> >> > > > The bnx2x module load/unload test results in a continuous hard
>> >> > > > LOCKUP trace on my powerpc bare-metal machine running a mainline
>> >> > > > 4.19.0-rc4 kernel.
>> ^^
>> >> 
>> >> Warnings also show up on 4.20.0-rc2-next-20181114
>> >
>> > We have a fix available: https://patchwork.ozlabs.org/patch/998054/
>> >
>> > It fixed the problem.
>> 
>> But the bug it fixes wasn't present in 4.19.0-rc4, which is the version
>> you originally reported against. Or is that version string not accurate?
>
> Yes, the version string was wrong. The bug was first seen on linux-next
> 4.19.0-rc3-next-20180913 and on mainline version 4.19.0-rc8.

OK. The commit that the above patch fixes was first in next-20181015, or
mainline as of v4.20-rc1~24.

cheers


Re: [mainline][ppc][bnx2x] watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70 when module load/unload

2018-11-15 Thread Abdul Haleem
On Fri, 2018-11-16 at 15:44 +1100, Michael Ellerman wrote:
> Abdul Haleem  writes:
> > On Thu, 2018-11-15 at 16:40 +0530, Abdul Haleem wrote:
> >> On Mon, 2018-09-24 at 15:49 +0530, Abdul Haleem wrote:
> >> > On Mon, 2018-09-24 at 19:35 +1000, Oliver wrote:
> >> > > On Mon, Sep 24, 2018 at 6:56 PM, Abdul Haleem
> >> > >  wrote:
> >> > > > Greetings,
> >> > > >
> >> > > > The bnx2x module load/unload test results in a continuous hard
> >> > > > LOCKUP trace on my powerpc bare-metal machine running a mainline
> >> > > > 4.19.0-rc4 kernel.
> ^^
> >> 
> >> Warnings also show up on 4.20.0-rc2-next-20181114
> >
> > We have a fix available: https://patchwork.ozlabs.org/patch/998054/
> >
> > It fixed the problem.
> 
> But the bug it fixes wasn't present in 4.19.0-rc4, which is the version
> you originally reported against. Or is that version string not accurate?

Yes, the version string was wrong. The bug was first seen on linux-next
4.19.0-rc3-next-20180913 and on mainline version 4.19.0-rc8.

-- 
Regards,

Abdul Haleem
IBM Linux Technology Centre





Re: [mainline][ppc][bnx2x] watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70 when module load/unload

2018-11-15 Thread Michael Ellerman
Abdul Haleem  writes:
> On Thu, 2018-11-15 at 16:40 +0530, Abdul Haleem wrote:
>> On Mon, 2018-09-24 at 15:49 +0530, Abdul Haleem wrote:
>> > On Mon, 2018-09-24 at 19:35 +1000, Oliver wrote:
>> > > On Mon, Sep 24, 2018 at 6:56 PM, Abdul Haleem
>> > >  wrote:
>> > > > Greetings,
>> > > >
>> > > > The bnx2x module load/unload test results in a continuous hard
>> > > > LOCKUP trace on my powerpc bare-metal machine running a mainline
>> > > > 4.19.0-rc4 kernel.
^^
>> 
>> Warnings also show up on 4.20.0-rc2-next-20181114
>
> We have a fix available: https://patchwork.ozlabs.org/patch/998054/
>
> It fixed the problem.

But the bug it fixes wasn't present in 4.19.0-rc4, which is the version
you originally reported against. Or is that version string not accurate?

cheers


Re: [mainline][ppc][bnx2x] watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70 when module load/unload

2018-11-15 Thread Abdul Haleem
On Thu, 2018-11-15 at 16:40 +0530, Abdul Haleem wrote:
> On Mon, 2018-09-24 at 15:49 +0530, Abdul Haleem wrote:
> > On Mon, 2018-09-24 at 19:35 +1000, Oliver wrote:
> > > On Mon, Sep 24, 2018 at 6:56 PM, Abdul Haleem
> > >  wrote:
> > > > Greetings,
> > > >
> > > > The bnx2x module load/unload test results in a continuous hard LOCKUP
> > > > trace on my powerpc bare-metal machine running a mainline 4.19.0-rc4
> > > > kernel.
> > > >
> > > > The instruction address points to:
> > > >
> > > > 0xc009d048 is in opal_interrupt
> > > > (arch/powerpc/platforms/powernv/opal-irqchip.c:133).
> > > > 128
> > > > 129 static irqreturn_t opal_interrupt(int irq, void *data)
> > > > 130 {
> > > > 131 __be64 events;
> > > > 132
> > > > 133 opal_handle_interrupt(virq_to_hw(irq), &events);
> > > > 134 last_outstanding_events = be64_to_cpu(events);
> > > > 135 if (opal_have_pending_events())
> > > > 136 opal_wake_poller();
> > > > 137
> > > >
> > > > trace:
> > > > bnx2x 0008:01:00.3 enP8p1s0f3: renamed from eth0
> > > > bnx2x 0008:01:00.3 enP8p1s0f3: using MSI-X  IRQs: sp 297  fp[0] 299 ... 
> > > > fp[7] 306
> > > > bnx2x 0008:01:00.2 enP8p1s0f2: NIC Link is Up, 1000 Mbps full duplex, 
> > > > Flow control: none
> > > > bnx2x 0008:01:00.3 enP8p1s0f3: NIC Link is Up, 1000 Mbps full duplex, 
> > > > Flow control: none
> > > > bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 
> > > > 1.712.30-0 (2014/02/10)
> > > > bnx2x 0008:01:00.0: msix capability found
> > > > bnx2x 0008:01:00.0: Using 64-bit DMA iommu bypass
> > > > bnx2x 0008:01:00.0: part number 0-0-0-0
> > > > bnx2x 0008:01:00.0: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 
> > > > link)
> > > > bnx2x 0008:01:00.0 enP8p1s0f0: renamed from eth0
> > > > bnx2x 0008:01:00.1: msix capability found
> > > > bnx2x 0008:01:00.1: Using 64-bit DMA iommu bypass
> > > > bnx2x 0008:01:00.1: part number 0-0-0-0
> > > > bnx2x 0008:01:00.0 enP8p1s0f0: using MSI-X  IRQs: sp 267  fp[0] 269 ... 
> > > > fp[7] 276
> > > > bnx2x 0008:01:00.0 enP8p1s0f0: NIC Link is Up, 1 Mbps full duplex, 
> > > > Flow control: ON - receive & transmit
> > > > bnx2x 0008:01:00.1: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 
> > > > link)
> > > > bnx2x 0008:01:00.1 enP8p1s0f1: renamed from eth0
> > > > bnx2x 0008:01:00.2: msix capability found
> > > > bnx2x 0008:01:00.2: Using 64-bit DMA iommu bypass
> > > > bnx2x 0008:01:00.2: part number 0-0-0-0
> > > > bnx2x 0008:01:00.1 enP8p1s0f1: using MSI-X  IRQs: sp 277  fp[0] 279 ... 
> > > > fp[7] 286
> > > > bnx2x 0008:01:00.1 enP8p1s0f1: NIC Link is Up, 1 Mbps full duplex, 
> > > > Flow control: ON - receive & transmit
> > > 
> > > 
> > > > watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70
> > > > watchdog: CPU 80 TB:980794111093, last heartbeat TB:973959617200 
> > > > (13348ms ago)
> > > 
> > > Ouch, 13 seconds in OPAL. Looks like we trip the hard lockup detector
> > > once the thread comes back into the kernel so we're not completely
> > > stuck. At a guess there's some contention on a lock in OPAL due to the
> > > bind/unbind loop, but I'm not sure why that would be happening.
> > > 
> > > Can you give us a copy of the OPAL log? (/sys/firmware/opal/msglog)
> > 
> > Oliver, thanks for looking into this. I have sent a private mail (the
> > file was 1MB) with the logs attached.
> > 
> 
> Oliver, any luck with the logs I sent?
> 
> Warnings also show up on 4.20.0-rc2-next-20181114

We have a fix available: https://patchwork.ozlabs.org/patch/998054/

It fixed the problem.

-- 
Regards,

Abdul Haleem
IBM Linux Technology Centre





Re: [mainline][ppc][bnx2x] watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70 when module load/unload

2018-11-15 Thread Abdul Haleem
On Mon, 2018-09-24 at 15:49 +0530, Abdul Haleem wrote:
> On Mon, 2018-09-24 at 19:35 +1000, Oliver wrote:
> > On Mon, Sep 24, 2018 at 6:56 PM, Abdul Haleem
> >  wrote:
> > > Greetings,
> > >
> > > The bnx2x module load/unload test results in a continuous hard LOCKUP
> > > trace on my powerpc bare-metal machine running a mainline 4.19.0-rc4
> > > kernel.
> > >
> > > The instruction address points to:
> > >
> > > 0xc009d048 is in opal_interrupt
> > > (arch/powerpc/platforms/powernv/opal-irqchip.c:133).
> > > 128
> > > 129 static irqreturn_t opal_interrupt(int irq, void *data)
> > > 130 {
> > > 131 __be64 events;
> > > 132
> > > 133 opal_handle_interrupt(virq_to_hw(irq), &events);
> > > 134 last_outstanding_events = be64_to_cpu(events);
> > > 135 if (opal_have_pending_events())
> > > 136 opal_wake_poller();
> > > 137
> > >
> > > trace:
> > > bnx2x 0008:01:00.3 enP8p1s0f3: renamed from eth0
> > > bnx2x 0008:01:00.3 enP8p1s0f3: using MSI-X  IRQs: sp 297  fp[0] 299 ... 
> > > fp[7] 306
> > > bnx2x 0008:01:00.2 enP8p1s0f2: NIC Link is Up, 1000 Mbps full duplex, 
> > > Flow control: none
> > > bnx2x 0008:01:00.3 enP8p1s0f3: NIC Link is Up, 1000 Mbps full duplex, 
> > > Flow control: none
> > > bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 
> > > (2014/02/10)
> > > bnx2x 0008:01:00.0: msix capability found
> > > bnx2x 0008:01:00.0: Using 64-bit DMA iommu bypass
> > > bnx2x 0008:01:00.0: part number 0-0-0-0
> > > bnx2x 0008:01:00.0: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
> > > bnx2x 0008:01:00.0 enP8p1s0f0: renamed from eth0
> > > bnx2x 0008:01:00.1: msix capability found
> > > bnx2x 0008:01:00.1: Using 64-bit DMA iommu bypass
> > > bnx2x 0008:01:00.1: part number 0-0-0-0
> > > bnx2x 0008:01:00.0 enP8p1s0f0: using MSI-X  IRQs: sp 267  fp[0] 269 ... 
> > > fp[7] 276
> > > bnx2x 0008:01:00.0 enP8p1s0f0: NIC Link is Up, 1 Mbps full duplex, 
> > > Flow control: ON - receive & transmit
> > > bnx2x 0008:01:00.1: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
> > > bnx2x 0008:01:00.1 enP8p1s0f1: renamed from eth0
> > > bnx2x 0008:01:00.2: msix capability found
> > > bnx2x 0008:01:00.2: Using 64-bit DMA iommu bypass
> > > bnx2x 0008:01:00.2: part number 0-0-0-0
> > > bnx2x 0008:01:00.1 enP8p1s0f1: using MSI-X  IRQs: sp 277  fp[0] 279 ... 
> > > fp[7] 286
> > > bnx2x 0008:01:00.1 enP8p1s0f1: NIC Link is Up, 1 Mbps full duplex, 
> > > Flow control: ON - receive & transmit
> > 
> > 
> > > watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70
> > > watchdog: CPU 80 TB:980794111093, last heartbeat TB:973959617200 (13348ms 
> > > ago)
> > 
> > Ouch, 13 seconds in OPAL. Looks like we trip the hard lockup detector
> > once the thread comes back into the kernel so we're not completely
> > stuck. At a guess there's some contention on a lock in OPAL due to the
> > bind/unbind loop, but I'm not sure why that would be happening.
> > 
> > Can you give us a copy of the OPAL log? (/sys/firmware/opal/msglog)
> 
> Oliver, thanks for looking into this. I have sent a private mail (the
> file was 1MB) with the logs attached.
> 

Oliver, any luck with the logs I sent?

Warnings also show up on 4.20.0-rc2-next-20181114

-- 
Regards,

Abdul Haleem
IBM Linux Technology Centre





Re: [mainline][ppc][bnx2x] watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70 when module load/unload

2018-09-24 Thread Abdul Haleem
On Mon, 2018-09-24 at 19:35 +1000, Oliver wrote:
> On Mon, Sep 24, 2018 at 6:56 PM, Abdul Haleem
>  wrote:
> > Greetings,
> >
> > The bnx2x module load/unload test results in a continuous hard LOCKUP
> > trace on my powerpc bare-metal machine running a mainline 4.19.0-rc4 kernel.
> >
> > The instruction address points to:
> >
> > 0xc009d048 is in opal_interrupt
> > (arch/powerpc/platforms/powernv/opal-irqchip.c:133).
> > 128
> > 129 static irqreturn_t opal_interrupt(int irq, void *data)
> > 130 {
> > 131 __be64 events;
> > 132
> > 133 opal_handle_interrupt(virq_to_hw(irq), &events);
> > 134 last_outstanding_events = be64_to_cpu(events);
> > 135 if (opal_have_pending_events())
> > 136 opal_wake_poller();
> > 137
> >
> > trace:
> > bnx2x 0008:01:00.3 enP8p1s0f3: renamed from eth0
> > bnx2x 0008:01:00.3 enP8p1s0f3: using MSI-X  IRQs: sp 297  fp[0] 299 ... 
> > fp[7] 306
> > bnx2x 0008:01:00.2 enP8p1s0f2: NIC Link is Up, 1000 Mbps full duplex, Flow 
> > control: none
> > bnx2x 0008:01:00.3 enP8p1s0f3: NIC Link is Up, 1000 Mbps full duplex, Flow 
> > control: none
> > bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 
> > (2014/02/10)
> > bnx2x 0008:01:00.0: msix capability found
> > bnx2x 0008:01:00.0: Using 64-bit DMA iommu bypass
> > bnx2x 0008:01:00.0: part number 0-0-0-0
> > bnx2x 0008:01:00.0: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
> > bnx2x 0008:01:00.0 enP8p1s0f0: renamed from eth0
> > bnx2x 0008:01:00.1: msix capability found
> > bnx2x 0008:01:00.1: Using 64-bit DMA iommu bypass
> > bnx2x 0008:01:00.1: part number 0-0-0-0
> > bnx2x 0008:01:00.0 enP8p1s0f0: using MSI-X  IRQs: sp 267  fp[0] 269 ... 
> > fp[7] 276
> > bnx2x 0008:01:00.0 enP8p1s0f0: NIC Link is Up, 1 Mbps full duplex, Flow 
> > control: ON - receive & transmit
> > bnx2x 0008:01:00.1: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
> > bnx2x 0008:01:00.1 enP8p1s0f1: renamed from eth0
> > bnx2x 0008:01:00.2: msix capability found
> > bnx2x 0008:01:00.2: Using 64-bit DMA iommu bypass
> > bnx2x 0008:01:00.2: part number 0-0-0-0
> > bnx2x 0008:01:00.1 enP8p1s0f1: using MSI-X  IRQs: sp 277  fp[0] 279 ... 
> > fp[7] 286
> > bnx2x 0008:01:00.1 enP8p1s0f1: NIC Link is Up, 1 Mbps full duplex, Flow 
> > control: ON - receive & transmit
> 
> 
> > watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70
> > watchdog: CPU 80 TB:980794111093, last heartbeat TB:973959617200 (13348ms 
> > ago)
> 
> Ouch, 13 seconds in OPAL. Looks like we trip the hard lockup detector
> once the thread comes back into the kernel so we're not completely
> stuck. At a guess there's some contention on a lock in OPAL due to the
> bind/unbind loop, but I'm not sure why that would be happening.
> 
> Can you give us a copy of the OPAL log? (/sys/firmware/opal/msglog)

Oliver, thanks for looking into this. I have sent a private mail (the
file was 1MB) with the logs attached.

-- 
Regards,

Abdul Haleem
IBM Linux Technology Centre





Re: [mainline][ppc][bnx2x] watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70 when module load/unload

2018-09-24 Thread Oliver
On Mon, Sep 24, 2018 at 6:56 PM, Abdul Haleem
 wrote:
> Greetings,
>
> The bnx2x module load/unload test results in a continuous hard LOCKUP trace
> on my powerpc bare-metal machine running a mainline 4.19.0-rc4 kernel.
>
> The instruction address points to:
>
> 0xc009d048 is in opal_interrupt
> (arch/powerpc/platforms/powernv/opal-irqchip.c:133).
> 128
> 129 static irqreturn_t opal_interrupt(int irq, void *data)
> 130 {
> 131 __be64 events;
> 132
> 133 opal_handle_interrupt(virq_to_hw(irq), &events);
> 134 last_outstanding_events = be64_to_cpu(events);
> 135 if (opal_have_pending_events())
> 136 opal_wake_poller();
> 137
>
> trace:
> bnx2x 0008:01:00.3 enP8p1s0f3: renamed from eth0
> bnx2x 0008:01:00.3 enP8p1s0f3: using MSI-X  IRQs: sp 297  fp[0] 299 ... fp[7] 
> 306
> bnx2x 0008:01:00.2 enP8p1s0f2: NIC Link is Up, 1000 Mbps full duplex, Flow 
> control: none
> bnx2x 0008:01:00.3 enP8p1s0f3: NIC Link is Up, 1000 Mbps full duplex, Flow 
> control: none
> bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 
> (2014/02/10)
> bnx2x 0008:01:00.0: msix capability found
> bnx2x 0008:01:00.0: Using 64-bit DMA iommu bypass
> bnx2x 0008:01:00.0: part number 0-0-0-0
> bnx2x 0008:01:00.0: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
> bnx2x 0008:01:00.0 enP8p1s0f0: renamed from eth0
> bnx2x 0008:01:00.1: msix capability found
> bnx2x 0008:01:00.1: Using 64-bit DMA iommu bypass
> bnx2x 0008:01:00.1: part number 0-0-0-0
> bnx2x 0008:01:00.0 enP8p1s0f0: using MSI-X  IRQs: sp 267  fp[0] 269 ... fp[7] 
> 276
> bnx2x 0008:01:00.0 enP8p1s0f0: NIC Link is Up, 1 Mbps full duplex, Flow 
> control: ON - receive & transmit
> bnx2x 0008:01:00.1: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
> bnx2x 0008:01:00.1 enP8p1s0f1: renamed from eth0
> bnx2x 0008:01:00.2: msix capability found
> bnx2x 0008:01:00.2: Using 64-bit DMA iommu bypass
> bnx2x 0008:01:00.2: part number 0-0-0-0
> bnx2x 0008:01:00.1 enP8p1s0f1: using MSI-X  IRQs: sp 277  fp[0] 279 ... fp[7] 
> 286
> bnx2x 0008:01:00.1 enP8p1s0f1: NIC Link is Up, 1 Mbps full duplex, Flow 
> control: ON - receive & transmit


> watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70
> watchdog: CPU 80 TB:980794111093, last heartbeat TB:973959617200 (13348ms ago)

Ouch, 13 seconds in OPAL. Looks like we trip the hard lockup detector
once the thread comes back into the kernel so we're not completely
stuck. At a guess there's some contention on a lock in OPAL due to the
bind/unbind loop, but I'm not sure why that would be happening.

Can you give us a copy of the OPAL log? (/sys/firmware/opal/msglog)
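
Sanity-checking that 13348ms figure: it is just the timebase delta from the
watchdog line converted to milliseconds. A minimal sketch, assuming the
standard 512 MHz POWER timebase (the tb_freq value is an assumption; the TB
counts are taken from the log):

#include <stdio.h>

int main(void)
{
        /* TB values copied from the watchdog message quoted above. */
        unsigned long long tb_now  = 980794111093ULL;  /* TB at detection */
        unsigned long long tb_last = 973959617200ULL;  /* last heartbeat TB */
        /* Assumption: 512 MHz timebase, the usual value on recent POWER. */
        unsigned long long tb_freq = 512000000ULL;

        /* Ticks to milliseconds; prints "13348 ms", matching the log. */
        printf("%llu ms\n", (tb_now - tb_last) * 1000 / tb_freq);
        return 0;
}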

> Modules linked in: bnx2x(+) iptable_mangle ipt_MASQUERADE iptable_nat
> nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv4 ipt_REJECT
> nf_reject_ipv4 xt_tcpudp tun bridge stp llc iptable_filter dm_mirror
> dm_region_hash dm_log dm_service_time vmx_crypto powernv_rng rng_core
> dm_multipath kvm_hv kvm binfmt_misc nfsd ip_tables x_tables autofs4 xfs
> lpfc crc_t10dif crct10dif_generic nvme_fc nvme_fabrics mdio libcrc32c
> nvme_core crct10dif_common [last unloaded: bnx2x]
> CPU: 80 PID: 0 Comm: swapper/80 Not tainted 4.19.0-rc4-autotest-autotest #1
> NIP:  c009d048 LR: c0092fd0 CTR: 30032a00
> REGS: c03fff493d80 TRAP: 0900   Not tainted  
> (4.19.0-rc4-autotest-autotest)
> MSR:  90009033   CR: 48004042  XER: 
> CFAR: c0092fbc IRQMASK: 1
> GPR00: 30005128 c03fff70f220 c10ae500 
> GPR04: 48004042 c009d048 90009033 0090
> GPR08:   c0092fe4 90001003
> GPR12: c0092fbc c03fff7ff300 c03c96c80c00 0001
> GPR16:  003c c03c96c80800 c03c96d00700
> GPR20: 0001 0001 0002 0014
> GPR24: c01fe8741000 c03fff70f330  c03ca947fb40
> GPR28: 092f47d0 0014 c01fe8741000 c01fe9860200
> NIP [c009d048] opal_interrupt+0x28/0x70
> LR [c0092fd0] opal_return+0x14/0x48
> Call Trace:
> [c03fff70f220] [c009d048] opal_interrupt+0x28/0x70 (unreliable)
> [c03fff70f250] [c016d890] __handle_irq_event_percpu+0x90/0x2d0
> [c03fff70f310] [c016db00] handle_irq_event_percpu+0x30/0x90
> [c03fff70f350] [c016dbc0] handle_irq_event+0x60/0xc0
> [c03fff70f380] [c0172d2c] handle_fasteoi_irq+0xbc/0x1f0
> [c03fff70f3b0] [c016c084] generic_handle_irq+0x44/0x70
> [c03fff70f3d0] [c00193cc] __do_irq+0x8c/0x200
> [c03fff70f440] [c0019640] do_IRQ+0x100/0x110
> [c03fff70f490] [c0008db8] hardware_interrupt_common+0x158/0x160
> --- interrupt: 501 at fib_table_lookup+0xfc/0x600
> LR = fib_validate_source+0x148/0x370
> [c03fff70f780] []   (null) (unreliable)

[mainline][ppc][bnx2x] watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70 when module load/unload

2018-09-24 Thread Abdul Haleem
Greetings,

The bnx2x module load/unload test results in a continuous hard LOCKUP trace
on my powerpc bare-metal machine running a mainline 4.19.0-rc4 kernel.

The instruction address points to:

0xc009d048 is in opal_interrupt
(arch/powerpc/platforms/powernv/opal-irqchip.c:133).
128 
129 static irqreturn_t opal_interrupt(int irq, void *data)
130 {
131 __be64 events;
132 
133 opal_handle_interrupt(virq_to_hw(irq), &events);
134 last_outstanding_events = be64_to_cpu(events);
135 if (opal_have_pending_events())
136 opal_wake_poller();
137 

trace:
bnx2x 0008:01:00.3 enP8p1s0f3: renamed from eth0
bnx2x 0008:01:00.3 enP8p1s0f3: using MSI-X  IRQs: sp 297  fp[0] 299 ... fp[7] 
306
bnx2x 0008:01:00.2 enP8p1s0f2: NIC Link is Up, 1000 Mbps full duplex, Flow 
control: none
bnx2x 0008:01:00.3 enP8p1s0f3: NIC Link is Up, 1000 Mbps full duplex, Flow 
control: none
bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 
(2014/02/10)
bnx2x 0008:01:00.0: msix capability found
bnx2x 0008:01:00.0: Using 64-bit DMA iommu bypass
bnx2x 0008:01:00.0: part number 0-0-0-0
bnx2x 0008:01:00.0: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
bnx2x 0008:01:00.0 enP8p1s0f0: renamed from eth0
bnx2x 0008:01:00.1: msix capability found
bnx2x 0008:01:00.1: Using 64-bit DMA iommu bypass
bnx2x 0008:01:00.1: part number 0-0-0-0
bnx2x 0008:01:00.0 enP8p1s0f0: using MSI-X  IRQs: sp 267  fp[0] 269 ... fp[7] 
276
bnx2x 0008:01:00.0 enP8p1s0f0: NIC Link is Up, 1 Mbps full duplex, Flow 
control: ON - receive & transmit
bnx2x 0008:01:00.1: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
bnx2x 0008:01:00.1 enP8p1s0f1: renamed from eth0
bnx2x 0008:01:00.2: msix capability found
bnx2x 0008:01:00.2: Using 64-bit DMA iommu bypass
bnx2x 0008:01:00.2: part number 0-0-0-0
bnx2x 0008:01:00.1 enP8p1s0f1: using MSI-X  IRQs: sp 277  fp[0] 279 ... fp[7] 
286
bnx2x 0008:01:00.1 enP8p1s0f1: NIC Link is Up, 1 Mbps full duplex, Flow 
control: ON - receive & transmit
watchdog: CPU 80 self-detected hard LOCKUP @ opal_interrupt+0x28/0x70
watchdog: CPU 80 TB:980794111093, last heartbeat TB:973959617200 (13348ms ago)
Modules linked in: bnx2x(+) iptable_mangle ipt_MASQUERADE iptable_nat
nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv4 ipt_REJECT
nf_reject_ipv4 xt_tcpudp tun bridge stp llc iptable_filter dm_mirror
dm_region_hash dm_log dm_service_time vmx_crypto powernv_rng rng_core
dm_multipath kvm_hv kvm binfmt_misc nfsd ip_tables x_tables autofs4 xfs
lpfc crc_t10dif crct10dif_generic nvme_fc nvme_fabrics mdio libcrc32c
nvme_core crct10dif_common [last unloaded: bnx2x]
CPU: 80 PID: 0 Comm: swapper/80 Not tainted 4.19.0-rc4-autotest-autotest #1
NIP:  c009d048 LR: c0092fd0 CTR: 30032a00
REGS: c03fff493d80 TRAP: 0900   Not tainted  (4.19.0-rc4-autotest-autotest)
MSR:  90009033   CR: 48004042  XER: 
CFAR: c0092fbc IRQMASK: 1 
GPR00: 30005128 c03fff70f220 c10ae500  
GPR04: 48004042 c009d048 90009033 0090 
GPR08:   c0092fe4 90001003 
GPR12: c0092fbc c03fff7ff300 c03c96c80c00 0001 
GPR16:  003c c03c96c80800 c03c96d00700 
GPR20: 0001 0001 0002 0014 
GPR24: c01fe8741000 c03fff70f330  c03ca947fb40 
GPR28: 092f47d0 0014 c01fe8741000 c01fe9860200 
NIP [c009d048] opal_interrupt+0x28/0x70
LR [c0092fd0] opal_return+0x14/0x48
Call Trace:
[c03fff70f220] [c009d048] opal_interrupt+0x28/0x70 (unreliable)
[c03fff70f250] [c016d890] __handle_irq_event_percpu+0x90/0x2d0
[c03fff70f310] [c016db00] handle_irq_event_percpu+0x30/0x90
[c03fff70f350] [c016dbc0] handle_irq_event+0x60/0xc0
[c03fff70f380] [c0172d2c] handle_fasteoi_irq+0xbc/0x1f0
[c03fff70f3b0] [c016c084] generic_handle_irq+0x44/0x70
[c03fff70f3d0] [c00193cc] __do_irq+0x8c/0x200
[c03fff70f440] [c0019640] do_IRQ+0x100/0x110
[c03fff70f490] [c0008db8] hardware_interrupt_common+0x158/0x160
--- interrupt: 501 at fib_table_lookup+0xfc/0x600
LR = fib_validate_source+0x148/0x370
[c03fff70f780] []   (null) (unreliable)
[c03fff70f7e0] [c0959af8] fib_validate_source+0x148/0x370
[c03fff70f8a0] [c08fd664] ip_route_input_rcu+0x214/0x970
[c03fff70f990] [c08fdde0] ip_route_input_noref+0x20/0x30
[c03fff70f9e0] [c0945e28] arp_process.constprop.14+0x3d8/0x8a0
[c03fff70faf0] [c089eb20] __netif_receive_skb_one_core+0x60/0x80
[c03fff70fb30] [c08a7d00] netif_receive_skb_internal+0x30/0x110
[c03fff70fb70] [c08a888c] napi_gro_receive+0x11c/0x1c0
[c03fff70fbb0]