Pawel Dziekonski wrote:
Hi,

today I got the following:

kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [20876] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21040] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21050] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21137] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21246] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21256] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21343] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21431] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21441] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: ib_mthca 0000:06:00.0: SW2HW_MPT failed (-11)
perfquery: ibpanic: [21528] madrpc_init: client_register for mgmt 1 failed: (Cannot allocate memory)
kernel: NETDEV WATCHDOG: ib0: transmit timed out
kernel: ib0: transmit timeout: latency 9037 msecs
kernel: ib0: queue stopped 1, tx_head 35341, tx_tail 35213
kernel: NETDEV WATCHDOG: ib0: transmit timed out
kernel: ib0: transmit timeout: latency 10087 msecs
kernel: ib0: queue stopped 1, tx_head 35341, tx_tail 35213

It seems that the IB connection is lost completely (Lustre is not working, I
can't ping remote IPoIB addresses, etc.).

Is it a hardware problem with the IB interface, or something else?

regards, P

# ibv_devinfo
hca_id: mthca0
        fw_ver:                 1.2.0
        node_guid:                      0030:487e:0c06:0000
        sys_image_guid:                 0030:487e:0c06:0003
        vendor_id:                      0x02c9
        vendor_part_id:                 25204
        hw_ver:                 0xA0
        board_id:               SM_0000000003
        phys_port_cnt:                  1
                port:   1
                  state:                        PORT_INIT (2)
                  max_mtu:              2048 (4)
                  active_mtu:           2048 (4)
                  sm_lid:                       1
                  port_lid:             448
                  port_lmc:             0x00



It looks like a HW issue, but to confirm, can you check whether it also happens with OFED 1.4?
If it does, please contact Mellanox support, since you might need a FW update.

Tziporet
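
The ibv_devinfo output quoted above reports the port in PORT_INIT rather than
PORT_ACTIVE. A minimal sketch for checking this across several nodes, assuming
the output has been captured as text: the helper names port_states and
inactive_ports are hypothetical, and the parsing relies only on the "port:"
and "state:" line layout shown in this thread.

```python
import re

# Sample text copied from the ibv_devinfo output quoted in this thread.
SAMPLE = """\
hca_id: mthca0
        fw_ver:                 1.2.0
        phys_port_cnt:                  1
                port:   1
                  state:                        PORT_INIT (2)
                  max_mtu:              2048 (4)
                  active_mtu:           2048 (4)
                  sm_lid:                       1
                  port_lid:             448
                  port_lmc:             0x00
"""


def port_states(devinfo_text):
    """Map each port number to its state name, e.g. {1: "PORT_INIT"}."""
    states = {}
    current_port = None
    for line in devinfo_text.splitlines():
        # Matches "port:   1" but not "phys_port_cnt:" or "port_lid:".
        m = re.match(r"\s*port:\s*(\d+)", line)
        if m:
            current_port = int(m.group(1))
            continue
        m = re.match(r"\s*state:\s*(\w+)", line)
        if m and current_port is not None:
            states[current_port] = m.group(1)
    return states


def inactive_ports(devinfo_text):
    """Return the sorted port numbers whose state is not PORT_ACTIVE."""
    states = port_states(devinfo_text)
    return [p for p in sorted(states) if states[p] != "PORT_ACTIVE"]


if __name__ == "__main__":
    print(inactive_ports(SAMPLE))
```

This only reports the state. It does not say why the port is not active, so
the OFED 1.4 comparison and the FW check suggested above still apply.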


_______________________________________________
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
