[Kernel-packages] [Bug 1429959] Re: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]
For documentation purposes, that commit eventually made mainline (a30c2a3bf8571c6748dd16edc10b32d45ed71a72). Note the issue could not be reproduced w/ 14.04.3 anyway.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1429959

Title:
  Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]

Status in linux package in Ubuntu:
  Fix Released

Bug description:
  ---Problem Description---
  PowerNV/Ubuntu 14.10 Auto Error Recovery is failing after error injected for sailfish

  ---uname output---
  Linux powerio-le21 3.16.0-23-generic #31-Ubuntu SMP Tue Oct 21 17:55:08 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

  Machine Type = 8286-42A

  ---Steps to Reproduce---
  There are 2 LUNs coming across 3 different paths and multipath is configured.
  1. Run I/O activity by running HTX load on the multipath devices.
  2. Verify I/O activity on the multipath devices with the iostat command.
  3. Inject an error with the following command:
     echo 0x8000 > /sys/kernel/debug/powerpc/PCI0001/err_injct_inboundA; sleep 1; echo 0x0 > /sys/kernel/debug/powerpc/PCI0001/err_injct_inboundA
  4. The error injection happened and the I/O activity was suspended, as confirmed by iostat.
  5. Error recovery of the PCI devices did not happen and the devices remained inaccessible.

  The dmesg during the event is as follows:

  [ 376.148715] systemd-logind[7123]: New session 6 of user root.
  [ 497.572751] EEH: Frozen PHB#1-PE#8 detected
  [ 497.572799] EEH: PE location: U78C9.001.WZS006T-P1-C12 , PHB location: U78C9.001.WZS006T-P1-C32
  [ 497.572890] CPU: 32 PID: 0 Comm: swapper/32 Tainted: G OE 3.16.0-23-generic #31-Ubuntu
  [ 497.572892] Call Trace:
  [ 497.572898] [c03fffe97b90] [c0017390] show_stack+0x170/0x290 (unreliable)
  [ 497.572902] [c03fffe97c70] [c0a05fc0] dump_stack+0x90/0xbc
  [ 497.572906] [c03fffe97ca0] [c0038010] eeh_dev_check_failure+0x560/0x580
  [ 497.572908] [c03fffe97d40] [c00380b8] eeh_check_failure+0x88/0xe0
  [ 497.572933] [c03fffe97d80] [d0001cb247a8] qla24xx_msix_rsp_q+0x108/0x200 [qla2xxx]
  [ 497.572936] [c03fffe97e10] [c01319b0] handle_irq_event_percpu+0x90/0x2b0
  [ 497.572938] [c03fffe97ed0] [c0131c38] handle_irq_event+0x68/0xd0
  [ 497.572940] [c03fffe97f00] [c0136f80] handle_fasteoi_irq+0xe0/0x2a0
  [ 497.572942] [c03fffe97f30] [c0130ca8] generic_handle_irq+0x58/0x90
  [ 497.572943] [c03fffe97f60] [c00119c0] __do_irq+0x80/0x190
  [ 497.572945] [c03fffe97f90] [c00253d0] call_do_irq+0x14/0x24
  [ 497.572946] [c02fe83abab0] [c0011b68] do_IRQ+0x98/0x140
  [ 497.572948] [c02fe83abb00] [c0002794] hardware_interrupt_common+0x114/0x180
  [ 497.572952] --- Exception: 501 at snooze_loop+0xd8/0x170 LR = snooze_loop+0x90/0x170
  [ 497.572955] [c02fe83abdf0] [c0a33680] cpu_online_mask+0x0/0x8 (unreliable)
  [ 497.572957] [c02fe83abe30] [c08405bc] cpuidle_enter_state+0x6c/0x140
  [ 497.572960] [c02fe83abe80] [c0113938] cpu_startup_entry+0x318/0x4c0
  [ 497.572962] [c02fe83abf20] [c0043844] start_secondary+0x324/0x350
  [ 497.572964] [c02fe83abf90] [c0009a6c] start_secondary_prolog+0x10/0x14
  [ 497.572973] EEH: Detected PCI bus error on PHB#1-PE#8
  [ 497.572978] EEH: This PCI device has failed 1 times in the last hour
  [ 497.572979] EEH: Notify device drivers to shutdown
  [ 497.573000] qla2xxx [0001:07:00.0]-015b:2: Disabling adapter.
  [ 497.573071] sd 2:0:1:1: [sdd] Unhandled error code
  [ 497.573072] sd 2:0:1:1: [sdd] Unhandled error code
  [ 497.573075] sd 2:0:1:0: [sdc] Unhandled error code
  [ 497.573076] sd 2:0:1:1: [sdd] Unhandled error code
  [ 497.573077] sd 2:0:1:1: [sdd] Unhandled error code
  [ 497.573078] sd 2:0:1:1: [sdd]
  [ 497.573079] sd 2:0:1:0: [sdc] Unhandled error code
  [ 497.573080] sd 2:0:1:0: [sdc] Unhandled error code
  [ 497.573081] sd 2:0:1:1: [sdd]
  [ 497.573082] sd 2:0:1:1: [sdd] Unhandled error code
  [ 497.573084] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
  [ 497.573085] sd 2:0:1:0: [sdc] Unhandled error code
  [ 497.573086] sd 2:0:1:1: [sdd] CDB:
  [ 497.573087] sd 2:0:1:1: [sdd]
  [ 497.573088] sd 2:0:1:0: [sdc]
  [ 497.573088] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
  [ 497.573089] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
  [ 497.573090] sd 2:0:1:1: [sdd] CDB:
  [ 497.573091] sd 2:0:1:1: [sdd]
  [ 497.573095] Read(10)
  [ 497.573095] sd 2:0:1:0: [sdc]
  [ 497.573096] sd 2:0:1:0: [sdc]
  [ 497.573097] :
  [ 497.573097] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
  [ 497.573099] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
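
When triaging a log like the one above, it can help to pull out just the EEH events. The following is a minimal sketch, not part of the original report: the function name summarize_eeh and the summary keys are hypothetical, and the patterns only cover the message forms that appear in the dmesg excerpt in this bug. It does not try to decide whether recovery succeeded.

```python
import re

# One dmesg entry: "[ 497.572751] message text"
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")
# Message forms taken from the dmesg excerpt in this bug report.
FROZEN_RE = re.compile(r"EEH: Frozen (PHB#\w+-PE#\w+) detected")
FAILED_RE = re.compile(r"EEH: This PCI device has failed (\d+) times")


def summarize_eeh(dmesg_text):
    """Collect frozen PEs, reported failure counts, and adapter-disable events.

    Hypothetical helper for illustration; lines that do not look like
    timestamped dmesg entries are skipped.
    """
    summary = {"frozen_pes": [], "fail_counts": [], "adapter_disabled": False}
    for raw in dmesg_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        message = match.group(2)
        frozen = FROZEN_RE.search(message)
        if frozen:
            summary["frozen_pes"].append(frozen.group(1))
        failed = FAILED_RE.search(message)
        if failed:
            summary["fail_counts"].append(int(failed.group(1)))
        if "Disabling adapter" in message:
            summary["adapter_disabled"] = True
    return summary


if __name__ == "__main__":
    sample = (
        "[ 497.572751] EEH: Frozen PHB#1-PE#8 detected\n"
        "[ 497.572978] EEH: This PCI device has failed 1 times in the last hour\n"
        "[ 497.573000] qla2xxx [0001:07:00.0]-015b:2: Disabling adapter.\n"
    )
    print(summarize_eeh(sample))
```

Feeding it the saved output of dmesg gives a quick view of which PE froze and whether the qla2xxx driver disabled the adapter, as it did in this report.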
[Kernel-packages] [Bug 1429959] Re: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]
** Changed in: linux (Ubuntu)
       Status: Confirmed => Fix Released
[Kernel-packages] [Bug 1429959] Re: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]
--- Comment From cha...@us.ibm.com 2015-07-27 19:43 EDT---
In Ubuntu 14.04.3 with kernel 3.19.0-22, this issue is not seen. EEH happens as expected, and adapter recovery works through the 5th injection. On the 6th, the adapter is also recovered after a reboot.

# uname -a
Linux powerio-le21 3.19.0-22-generic #22~14.04.1-Ubuntu SMP Wed Jun 17 10:03:39 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

Closing...

** Tags removed: targetmilestone-inin1504
** Tags added: targetmilestone-inin14043
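
A retest like the one described in this comment repeats the injection step from the bug description. Here is a minimal sketch of that single step, assuming the debugfs node path and the 0x8000 / 0x0 values quoted in the report. The function name inject_eeh_error and its parameters are hypothetical, and the real node needs root and a PowerNV kernel with debugfs mounted.

```python
import time
from pathlib import Path

# Path quoted in the bug description; only valid on the affected machine.
DEFAULT_NODE = "/sys/kernel/debug/powerpc/PCI0001/err_injct_inboundA"


def inject_eeh_error(node=DEFAULT_NODE, mask="0x8000", hold_seconds=1.0):
    """Write the error mask, wait, then clear it with 0x0.

    Mirrors: echo 0x8000 > node; sleep 1; echo 0x0 > node
    Hypothetical helper for illustration only.
    """
    node_path = Path(node)
    node_path.write_text(mask + "\n")   # arm the injection
    time.sleep(hold_seconds)            # give the I/O load time to hit it
    node_path.write_text("0x0\n")       # clear the injection


if __name__ == "__main__":
    # Runs against the real debugfs node; expect a PermissionError or
    # FileNotFoundError anywhere other than a suitably configured system.
    inject_eeh_error()
```

Taking the node path as a parameter lets the same helper be pointed at a different PHB, or at a scratch file for a dry run.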
[Kernel-packages] [Bug 1429959] Re: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]
@arges
I'll check whether QLogic can review/reply to get some activity going. I'm not experienced w/ the review/commit process for this subsystem, so if someone else should reply, please let me know.

Thanks for your attention on this one.
[Kernel-packages] [Bug 1429959] Re: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]
I haven't seen upstream take the patch yet. Does it need to be resent?

Thanks
[Kernel-packages] [Bug 1429959] Re: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]
** Changed in: linux (Ubuntu)
       Status: New => Confirmed

** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium
[Kernel-packages] [Bug 1429959] Re: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]
** Tags removed: targetmilestone-inin1410
** Tags added: targetmilestone-inin1504
[Kernel-packages] [Bug 1429959] Re: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]
** Tags added: kernel-da-key

https://bugs.launchpad.net/bugs/1429959

Status in linux package in Ubuntu: New
[Kernel-packages] [Bug 1429959] Re: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]
** Package changed: ubuntu => linux (Ubuntu)

https://bugs.launchpad.net/bugs/1429959

Status in linux package in Ubuntu: New