Re: [PATCH v5 00/21] EEH reorganization
> I just hit this on mainline from today (3.4.0-rc2-00065-gf549e08). Haven't
> had a chance to narrow it down yet.

Thanks for the information. I'll try to reproduce the issue on Firebird-L
today. By the way, it seems that mstmread is a user-level application that
was accessing the config space when the problem happened?

> Looking closer, it was caused by an EEH error at boot. It looks like the
> Mellanox infiniband card gets an error when probed by their firmware tool
> (mstmread), but only if the kernel driver is not loaded.
>
> I see this EEH error back on 3.0, so it's not new. The question now is why
> we oops in the EEH code on mainline.

It seems the crash was caused by something like WARN_ON(). I checked the
function the backtrace points to (eeh_dn_check_failure) and didn't find any
place that calls WARN_ON(). Maybe I missed something here. Anyway, I'll
first try to reproduce it on a Firebird-L machine and then narrow it down.

> Anton

Thanks,
Gavin

> [ cut here ]
> WARNING: at arch/powerpc/platforms/pseries/eeh.c:492
> Modules linked in:
> NIP: c0056cc4 LR: c0056cc0 CTR: c051dd60
> REGS: c01f3953f6a0 TRAP: 0700 Not tainted (3.4.0-rc2-00065-gf549e08-dirty)
> MSR: 80029032 SF,EE,ME,IR,DR,RI CR: 28004482 XER: 000f
> SOFTE: 0
> CFAR: c074ea30
> TASK = c01f39685040[19058] 'mstmread' THREAD: c01f3953c000 CPU: 38
> GPR00: c0056cc0 c01f3953f920 c0bd3a28 0021
> GPR04: 000323f7
> GPR08: 6365203c c0b10a20 0002 c0a74cc0
> GPR12: 24004422 ceda8500 3a58582e 583a5858
> GPR16: 2f585858 69636573 2f646576 10003b48
> GPR20: 0fffc7a3d17c 0058 0004 c01f3953fb90
> GPR24: c0c77088 c03e6fffeee8
> GPR28: c0d82680 c0c770d0
> NIP [c0056cc4] .eeh_dn_check_failure+0x304/0x320
> LR [c0056cc0] .eeh_dn_check_failure+0x300/0x320
> Call Trace:
> [c01f3953f920] [c0056cc0] .eeh_dn_check_failure+0x300/0x320 (unreliable)
> [c01f3953f9d0] [c002717c] .rtas_read_config+0x13c/0x1b0
> [c01f3953fa70] [c03d543c] .pci_user_read_config_dword+0xcc/0x150
> [c01f3953fb20] [c03e19d8] .pci_read_config+0xe8/0x2a0
> [c01f3953fc00] [c022d330] .read+0x130/0x210
> [c01f3953fce0] [c01a723c] .vfs_read+0xec/0x1e0
> [c01f3953fd80] [c01a73ec] .SyS_pread64+0xbc/0xd0
> [c01f3953fe30] [c0009780] syscall_exit+0x0/0x7c
> Instruction dump:
> 7f83e378 48001909 6000 2fbf 419e002c e89f00d8 2fa4 409e0008
> e89f0098 e8629fb8 486f7d39 6000 0fe0 3b21 4bfffdb4 e8829fa8
> ---[ end trace a6e6d788c9869e00 ]---
> EEH: Detected PCI bus error on device 0006:01:00.0
> EEH: This PCI device has failed 1 times in the last hour:
> EEH: Bus location=U78AB.001.WZSGRFL-P1-C4-T1 driver= pci addr=0006:01:00.0
> EEH: Device location=U78AB.001.WZSGRFL-P1-C4-T1 driver= pci addr=0006:01:00.0
> EEH: of node=/pci@8002203/pci1014,415@0
> EEH: PCI device/vendor: 673c15b3
> EEH: PCI cmd/status register: 00100140

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v5 00/21] EEH reorganization
Hi,

> Thanks for the information. I'll try to reproduce the issue on Firebird-L
> today. By the way, it seems that mstmread is some user-level application
> accessing the config space while the problem happened?

The EEH error is caused by the Mellanox firmware tools.

> It seems the crash was caused by something like WARN_ON(). I checked the
> function pointed by the backtrace (eeh_dn_check_failure) and I didn't find
> any place has called WARN_ON() stuff. Maybe I missed something here.

No. I replaced that backtrace in eeh_dn_check_failure with a WARN_ON()
because the backtrace doesn't give us enough info. I'm submitting a patch
for that today.

Bottom line is mstmread has been causing an EEH error since at least 3.0,
but in 3.4 we now oops instead of recovering. The signs all point to the
EEH rework in 3.4.

Anton
Re: [PATCH v5 00/21] EEH reorganization
On Tue, 2012-04-17 at 11:37 +1000, Anton Blanchard wrote:
> No. I replaced that backtrace in eeh_dn_check_failure with a WARN_ON()
> because the backtrace doesn't give us enough info. I'm submitting a patch
> for that today.
>
> Bottom line is mstmread has been causing an EEH error since at least 3.0,
> but in 3.4 we now oops instead of recovering. The signs all point to the
> EEH rework in 3.4.

More precisely, the original oops reported by Anton decodes as such:

> Oops: Kernel access of bad area, sig: 11 [#1]

This is typically a bad memory access.

> SMP NR_CPUS=1024 NUMA pSeries
> Modules linked in:
> NIP: c0055af8 LR: c0033204 CTR:
> REGS: c01f42fb7990 TRAP: 0300 Tainted: GW (3.4.0-rc2-00065-gf549e08-dirty)

TRAP: 300 means that it's the result of a data access interrupt, i.e. a
load or store to a bad address.

> MSR: 80009032 SF,EE,ME,IR,DR,RI CR: 24008084 XER:
> SOFTE: 1
> CFAR: 49b8
> DAR: 0070, DSISR: 4000

Here the DAR tells us what address was accessed. 0x70 is a strong
indication that this was an access to a NULL pointer (at offset 0x70 from
that pointer). It -might- be something else (such as a NULL passed to a
list head or such) but the idea that there's a NULL floating around is a
good hint.

> TASK = c01f6c7dfc40[19010] 'eehd' THREAD: c01f42fb4000 CPU: 6
> GPR00: 0001 c01f42fb7c10 c0bd3a28 c01f80ab0800
> GPR04: c01f7c57d418 0380 c01f7c57e070 c0ed5360
> GPR08: c0c77088 0001
> GPR12: 44008088 ceda1500 019ffa78 00a7
> GPR16: 00bb c0a9f754 c0963230 005e
> GPR20: 01b37e80 00bb c0b0ad90
> GPR24: c0b10588 0001 c01f80ab0800
> GPR28: c01f80ab0828 c01f7ee1
> NIP [c0055af8] .eeh_add_device_tree_late+0x58/0xf0

This is the function where it happened (eeh_add_device_tree_late).

> LR [c0033204] .pcibios_finish_adding_to_bus+0x34/0x50
> Call Trace:
> [c01f42fb7c10] [fdff] 0xfdff (unreliable)
> [c01f42fb7ca0] [c0033204] .pcibios_finish_adding_to_bus+0x34/0x50
> [c01f42fb7d20] [c0059a5c] .pcibios_add_pci_devices+0x7c/0x190
> [c01f42fb7db0] [c0057a6c] .eeh_reset_device+0xfc/0x1a0
> [c01f42fb7e50] [c0057e18] .handle_eeh_events+0x308/0x480
> [c01f42fb7f00] [c00584dc] .eeh_event_handler+0x13c/0x1d0
> [c01f42fb7f90] [c002099c] .kernel_thread+0x54/0x70

And your backtrace. You can see that you got an eeh event, which triggered
an eeh reset, which triggered a pcibios_add_pci_devices() etc...

> Instruction dump:
> 48a8 6000 ebff 7fbfe800 419e0098 2fbf 419e005c e9229eb0
> 80090008 2f80 419e004c ebdf01d0 e81e0070 7fbf 3160 7d2b0110

Cheers,
Ben.
Re: [PATCH v5 00/21] EEH reorganization
Ben, thanks a lot for the backtrace, which helped narrow down the root
cause. Thanks also for explaining how to parse the backtrace and the
registers printed by the oops ;-)

I have now reproduced the issue on a Firebird-L machine without loading the
device driver for the Emulex ethernet adapter, by disabling the
corresponding options in .config. With a config space data parity error
injected at the Emulex ethernet MAC, I saw the backtrace below.

The problem comes from the following piece of code. The EEH device should
be retrieved from the OF node instead of the PCI device, since the PCI
device doesn't track the corresponding EEH device yet at that point. I'll
send a patch for it soon, even though it only needs a one-line change ;-)

(gdb) p &(((struct eeh_dev *)0)->pdev)
$1 = (struct pci_dev **) 0x70

static void eeh_add_device_late(struct pci_dev *dev)
{
	struct device_node *dn;
	struct eeh_dev *edev;

	if (!dev || !eeh_subsystem_enabled)
		return;

	dn = pci_device_to_OF_node(dev);
	edev = pci_dev_to_eeh_dev(dev);		<-- edev should be NULL
	if (edev->pdev == dev) {		<-- data access fault here
		pr_debug("EEH: Already referenced !\n");
		return;
	}
	WARN_ON(edev->pdev);
	:
	:
}

[ 176.972046] Unable to handle kernel paging request for data at address 0x0070
[ 176.972054] Faulting instruction address: 0xc0055ecc
[ 176.972064] Oops: Kernel access of bad area, sig: 11 [#1]
[ 176.972070] SMP NR_CPUS=1024 NUMA pSeries
[ 176.972078] Modules linked in:
[ 176.972086] NIP: c0055ecc LR: c0055ec8 CTR: c005babc
[ 176.972102] REGS: c00f4d913970 TRAP: 0300 Not tainted (3.4.0-rc2+)
[ 176.972109] MSR: 80009032 SF,EE,ME,IR,DR,RI CR: 2884 XER: 0009
[ 176.972129] SOFTE: 1
[ 176.972133] CFAR: c0005080
[ 176.972138] DAR: 0070, DSISR: 4000
[ 176.972146] TASK = c00f4d8c3600[1038] 'eehd' THREAD: c00f4d91 CPU: 24
[ 176.972155] GPR00: c0055ec8 c00f4d913bf0 c147ed90 001e
[ 176.972170] GPR04:
[ 176.972183] GPR08: 4f4e450d c0c44208 00036710 00ec
[ 176.972197] GPR12: 2882 cff25400 0106c9c8
[ 176.972212] GPR16: 0228 02e5acf0 01aff9a4 0060
[ 176.972227] GPR20: c1345c78
[ 176.972241] GPR24: c1345c70 c0851ac0
[ 176.972256] GPR28: c0a95ad3 c00f529f2c28 c00f529f2c00 c00f4d88
[ 176.972276] NIP [c0055ecc] .eeh_add_device_tree_late+0x17c/0x2c4
[ 176.972286] LR [c0055ec8] .eeh_add_device_tree_late+0x178/0x2c4
[ 176.972294] Call Trace:
[ 176.972300] [c00f4d913bf0] [c0055ec8] .eeh_add_device_tree_late+0x178/0x2c4 (unreliable)
[ 176.972316] [c00f4d913ca0] [c0036bc8] .pcibios_finish_adding_to_bus+0x74/0x90
[ 176.972328] [c00f4d913d20] [c0059b50] .pcibios_add_pci_devices+0x12c/0x150
[ 176.972339] [c00f4d913db0] [c0057c60] .eeh_reset_device+0x10c/0x140
[ 176.972350] [c00f4d913e50] [c0057ee4] .handle_eeh_events+0x250/0x42c
[ 176.972361] [c00f4d913f10] [c0058560] .eeh_event_handler+0xe4/0x178
[ 176.972372] [c00f4d913f90] [c0021550] .kernel_thread+0x54/0x70
[ 176.972380] Instruction dump:
[ 176.972384] eb82a1f0 7f83e378 487dd2e9 6000 e862a1f8 7f64db78 487dd2d9 6000
[ 176.972400] eb5f02c0 7f83e378 487dd2c9 6000 e81a0070 7fa0f800 40de0028 e862a188

Thanks,
Gavin
Re: [PATCH v5 00/21] EEH reorganization
Hi Gavin,

> This series of patches is going to reorganize EEH so that it could support
> multiple platforms in future. The requirements were raised from the aspects.

I just hit this on mainline from today (3.4.0-rc2-00065-gf549e08). Haven't
had a chance to narrow it down yet.

Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in:
NIP: c0055af8 LR: c0033204 CTR:
REGS: c01f42fb7990 TRAP: 0300 Tainted: GW (3.4.0-rc2-00065-gf549e08-dirty)
MSR: 80009032 SF,EE,ME,IR,DR,RI CR: 24008084 XER:
SOFTE: 1
CFAR: 49b8
DAR: 0070, DSISR: 4000
TASK = c01f6c7dfc40[19010] 'eehd' THREAD: c01f42fb4000 CPU: 6
GPR00: 0001 c01f42fb7c10 c0bd3a28 c01f80ab0800
GPR04: c01f7c57d418 0380 c01f7c57e070 c0ed5360
GPR08: c0c77088 0001
GPR12: 44008088 ceda1500 019ffa78 00a7
GPR16: 00bb c0a9f754 c0963230 005e
GPR20: 01b37e80 00bb c0b0ad90
GPR24: c0b10588 0001 c01f80ab0800
GPR28: c01f80ab0828 c01f7ee1
NIP [c0055af8] .eeh_add_device_tree_late+0x58/0xf0
LR [c0033204] .pcibios_finish_adding_to_bus+0x34/0x50
Call Trace:
[c01f42fb7c10] [fdff] 0xfdff (unreliable)
[c01f42fb7ca0] [c0033204] .pcibios_finish_adding_to_bus+0x34/0x50
[c01f42fb7d20] [c0059a5c] .pcibios_add_pci_devices+0x7c/0x190
[c01f42fb7db0] [c0057a6c] .eeh_reset_device+0xfc/0x1a0
[c01f42fb7e50] [c0057e18] .handle_eeh_events+0x308/0x480
[c01f42fb7f00] [c00584dc] .eeh_event_handler+0x13c/0x1d0
[c01f42fb7f90] [c002099c] .kernel_thread+0x54/0x70
Instruction dump:
48a8 6000 ebff 7fbfe800 419e0098 2fbf 419e005c e9229eb0
80090008 2f80 419e004c ebdf01d0 e81e0070 7fbf 3160 7d2b0110

Anton
Re: [PATCH v5 00/21] EEH reorganization
Hi,

> I just hit this on mainline from today (3.4.0-rc2-00065-gf549e08). Haven't
> had a chance to narrow it down yet.

Looking closer, it was caused by an EEH error at boot. It looks like the
Mellanox infiniband card gets an error when probed by their firmware tool
(mstmread), but only if the kernel driver is not loaded.

I see this EEH error back on 3.0, so it's not new. The question now is why
we oops in the EEH code on mainline.

Anton

[ cut here ]
WARNING: at arch/powerpc/platforms/pseries/eeh.c:492
Modules linked in:
NIP: c0056cc4 LR: c0056cc0 CTR: c051dd60
REGS: c01f3953f6a0 TRAP: 0700 Not tainted (3.4.0-rc2-00065-gf549e08-dirty)
MSR: 80029032 SF,EE,ME,IR,DR,RI CR: 28004482 XER: 000f
SOFTE: 0
CFAR: c074ea30
TASK = c01f39685040[19058] 'mstmread' THREAD: c01f3953c000 CPU: 38
GPR00: c0056cc0 c01f3953f920 c0bd3a28 0021
GPR04: 000323f7
GPR08: 6365203c c0b10a20 0002 c0a74cc0
GPR12: 24004422 ceda8500 3a58582e 583a5858
GPR16: 2f585858 69636573 2f646576 10003b48
GPR20: 0fffc7a3d17c 0058 0004 c01f3953fb90
GPR24: c0c77088 c03e6fffeee8
GPR28: c0d82680 c0c770d0
NIP [c0056cc4] .eeh_dn_check_failure+0x304/0x320
LR [c0056cc0] .eeh_dn_check_failure+0x300/0x320
Call Trace:
[c01f3953f920] [c0056cc0] .eeh_dn_check_failure+0x300/0x320 (unreliable)
[c01f3953f9d0] [c002717c] .rtas_read_config+0x13c/0x1b0
[c01f3953fa70] [c03d543c] .pci_user_read_config_dword+0xcc/0x150
[c01f3953fb20] [c03e19d8] .pci_read_config+0xe8/0x2a0
[c01f3953fc00] [c022d330] .read+0x130/0x210
[c01f3953fce0] [c01a723c] .vfs_read+0xec/0x1e0
[c01f3953fd80] [c01a73ec] .SyS_pread64+0xbc/0xd0
[c01f3953fe30] [c0009780] syscall_exit+0x0/0x7c
Instruction dump:
7f83e378 48001909 6000 2fbf 419e002c e89f00d8 2fa4 409e0008
e89f0098 e8629fb8 486f7d39 6000 0fe0 3b21 4bfffdb4 e8829fa8
---[ end trace a6e6d788c9869e00 ]---
EEH: Detected PCI bus error on device 0006:01:00.0
EEH: This PCI device has failed 1 times in the last hour:
EEH: Bus location=U78AB.001.WZSGRFL-P1-C4-T1 driver= pci addr=0006:01:00.0
EEH: Device location=U78AB.001.WZSGRFL-P1-C4-T1 driver= pci addr=0006:01:00.0
EEH: of node=/pci@8002203/pci1014,415@0
EEH: PCI device/vendor: 673c15b3
EEH: PCI cmd/status register: 00100140
Re: [PATCH v5 00/21] EEH reorganization
Hi Ben,

Could you please take a look at this when you have time?

Thanks,
Gavin

> This series of patches reorganizes EEH so that it can support multiple
> platforms in future. The requirements come from the following aspects.
>
> * The original EEH implementation only supports the pSeries platform,
>   which would be regarded as a guest system. Platform powernv is coming
>   and EEH needs to be supported on powernv as well.
> * Different platforms might be running on different firmware.
>   Furthermore, the firmware would supply different EEH interfaces to the
>   kernel. Therefore, we have to do the necessary abstraction on the
>   current EEH implementation.
>
> In order to accommodate the requirements, the series of patches
> reorganizes the current EEH implementation.
>
> * The original implementation is not clean enough. Necessary cleanup
>   is done in some of the patches.
> * struct eeh_ops has been introduced so that EEH core components and
>   platform dependent implementation can be split up. That makes it
>   possible for EEH to be supported on multiple platforms.
> * struct eeh_dev has been introduced to replace struct pci_dn so that
>   the EEH module works as independently as possible.
> * EEH global statistics will be maintained in a collective fashion.
>
> v1 -> v2:
> * If possible, add the eeh_ prefix to function names.
> * The format of leading function comments won't be changed, in order
>   not to break automatic kernel document generation (e.g. by make
>   pdfdocs).
> * The names of local variables won't be changed unless there is an
>   explicit reason.
> * Represent the PE's state as a bitmap.
> * Some function names have been adjusted so that they are shorter and
>   more meaningful.
> * Platform operation name has been changed to pseries.
> * Merge the cleanup patches where possible.
> * Line length is kept appropriately short where possible.
> * Fix up alignment and spacing issues.
>
> v2 -> v3:
> * Split cleanup patch into 2: one for comment cleanup and another one
>   for renaming functions.
> * Try to use pr_warning/pr_info/pr_debug instead of printk().
> * Function names are adjusted a little so that they look more
>   meaningful, according to comments from Michael/Ben.
> * Useful comments have been kept according to Michael's comments.
> * struct eeh_ops::set_eeh has been changed to eeh_ops::set_option.
> * struct eeh_ops::name has been changed to char *.
> * Remove file name from the source file.
> * Copyright (C) format has been changed since (C) isn't encouraged.
> * The header files included in the source file have been sorted
>   alphabetically.
> * eeh_platform_init() has been replaced by eeh_pseries_init() to avoid
>   duplicate functions when the kernel supports multiple platforms.
> * F/W has been changed to Firmware.
> * The maximal wait time to retrieve the PE's state has been covered by
>   a macro.
> * It also includes changes according to the minor comments from Michael.
>
> v3 -> v4:
> * Fix some typos in the commit messages.
> * Reduce code nesting according to Ram's suggestions.
> * Additional pr_warning on failure of configuring bridges.
>
> v4 -> v5:
> * OF node and PCI device are tracking the corresponding eeh device.
>   That has been changed to struct eeh_dev * instead of the original
>   void *.
> * The conversion between OF node, PCI device and eeh device is changed
>   to inline functions instead of the original macros.
> * struct eeh_stats has been moved from eeh.h to eeh.c. Besides, the
>   individual members of the struct have been changed to fixed-type
>   unsigned int.
>
> The series of patches (v5) has been verified on a Firebird-L machine. In
> order to carry out the test, you have to install IBM Power Tools from the
> IBM internal yum source. The following command is used to force an EEH
> check on the ethernet interface, from which EEH and the device driver
> eventually recover successfully. You can keep pinging the blade before
> issuing the command to force EEH. You should see that the network
> interface can't be reached for a moment and that everything recovers a
> couple of seconds after the forced EEH error. At the same time, you
> should see the EEH error log on the system console.
>
> * errinjct eeh -v -f 0 -p U78AE.001.WZS00M9-P1-C18-L1-T2 -a 0x0 -m 0x0
>
> -
>  arch/powerpc/include/asm/device.h        |    3 +
>  arch/powerpc/include/asm/eeh.h           |  134 +++-
>  arch/powerpc/include/asm/eeh_event.h     |   33 +-
>  arch/powerpc/include/asm/ppc-pci.h       |   89 +--
>  arch/powerpc/kernel/of_platform.c        |    3 +
>  arch/powerpc/kernel/rtas_pci.c           |    3 +
>  arch/powerpc/platforms/pseries/Makefile  |