Hal Rosenstock wrote:
Hi Rob,
On Tue, Nov 25, 2008 at 10:21 AM, Robert Dunkley <[EMAIL PROTECTED]> wrote:
Hi Hal,
Thanks again, I will try this in a minute. I think I have found the
moment it went bad on Machine A, using dmesg:
ib_mthca 0000:87:00.0: Catastrophic error detected: unknown error
Definitely need to reset mthca after this.
ib_mthca 0000:87:00.0: buf[00]: ffffffff
ib_mthca 0000:87:00.0: buf[01]: ffffffff
ib_mthca 0000:87:00.0: buf[02]: ffffffff
ib_mthca 0000:87:00.0: buf[03]: ffffffff
ib_mthca 0000:87:00.0: buf[04]: ffffffff
ib_mthca 0000:87:00.0: buf[05]: ffffffff
ib_mthca 0000:87:00.0: buf[06]: ffffffff
ib_mthca 0000:87:00.0: buf[07]: ffffffff
ib_mthca 0000:87:00.0: buf[08]: ffffffff
ib_mthca 0000:87:00.0: buf[09]: ffffffff
ib_mthca 0000:87:00.0: buf[0a]: ffffffff
ib_mthca 0000:87:00.0: buf[0b]: ffffffff
ib_mthca 0000:87:00.0: buf[0c]: ffffffff
ib_mthca 0000:87:00.0: buf[0d]: ffffffff
ib_mthca 0000:87:00.0: buf[0e]: ffffffff
ib_mthca 0000:87:00.0: buf[0f]: ffffffff
ib_mthca 0000:87:00.0: HW2SW_MPT failed (-11)
ib0: ib_query_gid() failed
ib_mthca 0000:87:00.0: HW2SW_MPT failed (-11)
ib0: ib_query_port failed
ib0: Failed to modify QP to ERROR state
ib0: timing out; 1 sends 250 receives not completed
ib0: Failed to modify QP to RESET state
ib_mthca 0000:87:00.0: HW2SW_MPT failed (-11)
ib_mthca 0000:87:00.0: HW2SW_CQ failed (-11)
ib_mthca 0000:87:00.0: HW2SW_MPT failed (-11)
ib_mthca 0000:87:00.0: HW2SW_CQ failed (-11)
ib_mthca 0000:87:00.0: HW2SW_MPT failed (-11)
ib_mthca 0000:87:00.0: HW2SW_SRQ failed (-11)
ib_mthca 0000:87:00.0: HW2SW_MPT failed (-11)
ib_mthca 0000:87:00.0: HW2SW_MPT failed (-11)
ib_mthca 0000:87:00.0: HW2SW_MPT failed (-11)
ib_mthca 0000:87:00.0: HW2SW_MPT failed (-11)
ib_mthca 0000:87:00.0: HW2SW_MPT failed (-11)
Does this help to pinpoint what might have caused this?
The ffffffff values in the buf dump show that you have some PCI bus error.
The mthca driver then moved to error mode, so no further commands will be
executed. I suggest you check that the card has not moved in the system,
and you had better reboot the system again.
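
If it helps to check other machines for the same signature, here is a rough sketch in Python. It assumes the dmesg output has been saved as text and that the buf lines look exactly like the ones quoted above; the function name all_ones_devices is made up for illustration and is not part of any OFED tool. It only reports which devices dumped an error buffer made up entirely of ffffffff, which is what a PCI read typically returns when the device does not respond.

```python
import re

# Matches lines such as "ib_mthca 0000:87:00.0: buf[0f]: ffffffff"
BUF_LINE = re.compile(r"ib_mthca (\S+): buf\[([0-9a-f]{2})\]: ([0-9a-f]{8})")


def all_ones_devices(dmesg_text):
    """Return PCI addresses whose dumped error buffer is entirely ffffffff.

    Hypothetical helper: it only inspects the text it is given and does
    not talk to the driver or the hardware.
    """
    words = {}
    for match in BUF_LINE.finditer(dmesg_text):
        device, _index, value = match.groups()
        words.setdefault(device, []).append(value)
    return sorted(
        device
        for device, values in words.items()
        if all(value == "ffffffff" for value in values)
    )


# Sample input rebuilt from the log lines quoted in this thread.
sample = "\n".join(
    f"ib_mthca 0000:87:00.0: buf[{i:02x}]: ffffffff" for i in range(16)
)
print(all_ones_devices(sample))  # ['0000:87:00.0']
```

A device that shows up in this list still needs the physical check and the reboot suggested above; the script only flags the pattern.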
Tziporet
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general