Yeah, I did reboot.  I verified the modules weren't loaded (lsmod), and then modprobed ib_mthca.  The same errors that I was seeing during startup were printed to the screen:

p5l1:~# lsmod
Module                  Size  Used by
p5l1:~# modprobe ib_mthca
[599947.213712] ib_mthca: Mellanox InfiniBand HCA driver v0.06 (June 23, 2005)
[599947.213732] ib_mthca: Initializing Mellanox Technologies MT23108 InfiniHost (0001:c1:00.0)
[599948.488315] EEH: MMIO failure (2) on device: pci15b3,5a44 /[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/pci15b3,[EMAIL PROTECTED]
[599948.488343] Call Trace:
[599948.488351] [c00000000f02b050] [c00000000002fc80] .eeh_dn_check_failure+0x2bc/0x314 (unreliable)
[599948.488380] [c00000000f02b130] [c00000000002fdd4] .eeh_check_failure+0xfc/0x190
[599948.488425] [c00000000f02b1c0] [d0000000005f37cc] .mthca_cmd_poll+0x120/0x258 [ib_mthca]
[599948.488469] [c00000000f02b290] [d0000000005f3cc8] .mthca_cmd_box+0x90/0xa8 [ib_mthca]
[599948.488516] [c00000000f02b330] [d0000000005f5444] .mthca_INIT_HCA+0x240/0x288 [ib_mthca]
[599948.488561] [c00000000f02b3e0] [d0000000005f2790] .mthca_init_one+0xd2c/0x180c [ib_mthca]
[599948.488600] [c00000000f02b870] [c0000000001d4a2c] .pci_device_probe+0xac/0xdc
[599948.488622] [c00000000f02b900] [c000000000239ec0] .driver_probe_device+0x80/0x15c
[599948.488647] [c00000000f02b990] [c00000000023a130] .__driver_attach+0xa8/0xc4
[599948.488669] [c00000000f02ba20] [c0000000002390d4] .bus_for_each_dev+0x78/0xcc
[599948.488699] [c00000000f02bad0] [c00000000023a174] .driver_attach+0x28/0x40
[599948.488718] [c00000000f02bb50] [c000000000239848] .bus_add_driver+0xc8/0x1dc
[599948.488751] [c00000000f02bc00] [c00000000023a7b0] .driver_register+0x44/0x5c
[599948.488771] [c00000000f02bc90] [c0000000001d46e4] .pci_register_driver+0x84/0xd8
[599948.488808] [c00000000f02bd10] [d000000000607594] .mthca_init+0x1c/0x48 [ib_mthca]
[599948.488857] [c00000000f02bd90] [c00000000006cc88] .sys_init_module+0x2f0/0x4cc
[599948.488885] [c00000000f02be30] [c00000000000d300] syscall_exit+0x0/0x18
[599948.488914] EEH: MMIO failure (2), notifiying device 0001:c1:00.0 Mellanox Technologies MT23108 InfiniHost
[599948.488986] ib_mthca 0001:c1:00.0: HCA FW version 3.2.0 is old (3.3.3 is current).
[599948.489002] ib_mthca 0001:c1:00.0: If you have problems, try updating your HCA FW.
[599948.490093] ib_mthca 0001:c1:00.0: SW2HW_MPT returned status 0x01
[599948.490107] ib_mthca 0001:c1:00.0: Failed to create driver PD, aborting.
[599948.492268] ib_mthca: probe of 0001:c1:00.0 failed with error -22


This is on an OpenPower 720...

Thaddeus


On 9/22/05, Pradeep Satyanarayana <[EMAIL PROTECTED]> wrote:

Adding ib_mthca to /etc/hotplug/blacklist worked for us (i.e. it is the workaround we adopted). Just to double check, you did reboot after adding it to the blacklist and then loaded ib_mthca after the reboot, right?

BTW, what kind of Power5 machine are you using?

Pradeep
[EMAIL PROTECTED]


From: Thaddeus Ternes <[EMAIL PROTECTED]>
Date: 09/22/2005 01:42 PM
Please respond to: Thaddeus Ternes
To: Roland Dreier <[EMAIL PROTECTED]>
cc: Pradeep Satyanarayana/Beaverton/[EMAIL PROTECTED], [email protected]
Subject: Re: [openib-general] EEH: MMIO Failure on Power5


Yeah, same result as before.

On 9/22/05, Roland Dreier <[EMAIL PROTECTED]> wrote:
>     Thaddeus> These are OpenPower 720 machines.  I've been away from
>     Thaddeus> the office for a few days, so I'll do some more poking
>     Thaddeus> around to see if I can come up with anything else.
>     Thaddeus> Maybe I've missed something in the logs or dmesg...
>
> Have you tried the workaround of adding 'ib_mthca' to /etc/hotplug/blacklist
> and then loading the module after the system is fully booted?
>
>  - R.
>



_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
