Pawel Dziekonski wrote:
Hi,
today I got the following:
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [20876] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21040] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21050] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21137] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21246] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21256] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21343] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21431] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21441] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21528] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: NETDEV WATCHDOG: ib0: transmit timed out
kernel: ib0: transmit timeout: latency 9037 msecs
kernel: ib0: queue stopped 1, tx_head 35341, tx_tail 35213
kernel: NETDEV WATCHDOG: ib0: transmit timed out
kernel: ib0: transmit timeout: latency 10087 msecs
kernel: ib0: queue stopped 1, tx_head 35341, tx_tail 35213
It seems that the IB connection is lost completely (Lustre is not working, I
can't ping remote IPoIB addresses, etc.).
Is this a hardware problem with the IB interface, or something else?
regards, P
# ibv_devinfo
hca_id: mthca0
        fw_ver:                 1.2.0
        node_guid:              0030:487e:0c06:0000
        sys_image_guid:         0030:487e:0c06:0003
        vendor_id:              0x02c9
        vendor_part_id:         25204
        hw_ver:                 0xA0
        board_id:               SM_0000000003
        phys_port_cnt:          1
                port:   1
                        state:          PORT_INIT (2)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         1
                        port_lid:       448
                        port_lmc:       0x00
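Note that the output above reports port 1 in state PORT_INIT rather than PORT_ACTIVE. A minimal Python sketch for pulling the per-port state out of saved `ibv_devinfo` text follows, in case it helps when checking several nodes. The function name `parse_port_states` and the trimmed sample are illustrative assumptions, not part of any OFED tool, and the parser only assumes the `key: value` layout shown above.

```python
def parse_port_states(devinfo_text):
    """Return {port_number: state_name} from ibv_devinfo-style text."""
    states = {}
    current_port = None
    for line in devinfo_text.splitlines():
        # Split on the first colon only, so GUID values keep their colons.
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "port" and value.isdigit():
            current_port = int(value)
        elif key == "state" and current_port is not None and value:
            # "PORT_INIT (2)" -> "PORT_INIT"
            states[current_port] = value.split()[0]
    return states


# Trimmed copy of the output posted above (hypothetical test input).
sample = """\
hca_id: mthca0
node_guid: 0030:487e:0c06:0000
phys_port_cnt: 1
port: 1
state: PORT_INIT (2)
port_lid: 448
"""

for port, state in parse_port_states(sample).items():
    note = "active" if state == "PORT_ACTIVE" else "not active"
    print(f"port {port}: {state} ({note})")
```

The sketch only reads text, so it can be run on output captured earlier, even if the HCA is no longer responding.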
This looks like a hardware issue, but to confirm, can you check whether it
also happens with OFED 1.4?
If it does, please contact Mellanox support, since you may need a firmware update.
Tziporet
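
To make the comparison with OFED 1.4 concrete, one option is to count how often each error signature from the log above shows up in a saved syslog excerpt before and after the upgrade. The following is a rough Python sketch under that assumption. The names `PATTERNS` and `tally` are made up for illustration, and the regular expressions are derived only from the messages posted in this thread.

```python
import re
from collections import Counter

# Signatures taken from the log lines quoted in this thread.
PATTERNS = {
    "SW2HW_MPT failed": re.compile(r"SW2HW_MPT failed \(-?\d+\)"),
    "client_register failed": re.compile(r"madrpc_init: client_register .* failed"),
    "transmit timed out": re.compile(r"NETDEV WATCHDOG: \S+ transmit timed out"),
}


def tally(log_text):
    """Count how many lines match each known error signature."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Short excerpt of the posted log (hypothetical test input).
excerpt = """\
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [20876] madrpc_init: client_register for mgmt 1 failed:
(Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
kernel: NETDEV WATCHDOG: ib0: transmit timed out
kernel: ib0: transmit timeout: latency 9037 msecs
"""

for name, count in sorted(tally(excerpt).items()):
    print(f"{name}: {count}")
```

If the same signatures appear with OFED 1.4, that would support the hardware or firmware explanation suggested above.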
_______________________________________________
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general