Hi all:

I have an IB cluster of eight IBM HS21 blades running a mix of RHEL 5.2
Server and SLES 10 SP2, all connected to one IB switch. opensm runs as the
subnet manager on one blade, and ibcheckerrors used to complete cleanly.
Last week I added another eight IBM LS21 blades, connected to a second IB
switch. After I connected the two switches and brought up all the IB
adapters on the new blades, ibcheckerrors reported the following error:

[EMAIL PROTECTED] ~]# ibcheckerrors
#warn: counter RcvErrors = 5691         (threshold 10) lid 3 port 1
Error check on lid 3 (gaia-07 HCA-1) port 1:  FAILED

## Summary: 19 nodes checked, 0 bad nodes found
##          46 ports checked, 1 ports have errors beyond threshold
[EMAIL PROTECTED] ~]# ibv_devinfo
hca_id: mlx4_0
        fw_ver:                         2.3.000
        node_guid:                      0002:c903:0001:3370
        sys_image_guid:                 0002:c903:0001:3373
        vendor_id:                      0x02c9
        vendor_part_id:                 25418
        hw_ver:                         0xA0
        board_id:                       IBM08A0000001
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 15
                        port_lid:               3
                        port_lmc:               0x00

                port:   2
                        state:                  PORT_DOWN (1)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
[EMAIL PROTECTED] ~]# ibcheckport 3 1
[EMAIL PROTECTED] ~]# echo $?
0
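
For anyone who wants to track the failing counter across repeated runs, the
"#warn:" lines can be pulled out of saved ibcheckerrors output with a short
script. This is only a minimal sketch: the regular expression assumes the
exact line layout shown in the output above, and the names WARN_RE,
parse_warnings and SAMPLE are made up for illustration and are not part of
any IB tool.

```python
import re

# Matches lines laid out like the one in the post, e.g.:
#   #warn: counter RcvErrors = 5691         (threshold 10) lid 3 port 1
# Assumption: other ibcheckerrors versions may format this line differently.
WARN_RE = re.compile(
    r"^#warn: counter (?P<counter>\w+) = (?P<value>\d+)\s+"
    r"\(threshold (?P<threshold>\d+)\) lid (?P<lid>\d+) port (?P<port>\d+)"
)


def parse_warnings(text):
    """Return one dict per '#warn:' line found in the given output text."""
    found = []
    for line in text.splitlines():
        match = WARN_RE.match(line.strip())
        if match is None:
            continue
        record = {}
        for key, raw in match.groupdict().items():
            # The counter name stays a string; everything else is numeric.
            record[key] = raw if key == "counter" else int(raw)
        found.append(record)
    return found


# Sample input copied from the ibcheckerrors output quoted in this post.
SAMPLE = """\
#warn: counter RcvErrors = 5691         (threshold 10) lid 3 port 1
Error check on lid 3 (gaia-07 HCA-1) port 1:  FAILED

## Summary: 19 nodes checked, 0 bad nodes found
##          46 ports checked, 1 ports have errors beyond threshold
"""

if __name__ == "__main__":
    for warning in parse_warnings(SAMPLE):
        print(warning)
```

Saving the output of two runs a few minutes apart and comparing the parsed
values would show whether RcvErrors is still increasing or is left over from
the moment the switches were connected.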

I have disabled the embedded subnet managers on both IB switches. The
problem persists even after I move the subnet manager to another machine.
ib0 on gaia-07 and the other machines can communicate with each other. All
installed IB adapters are ConnectX 4x SDR, and both switches are Topspin
switches. Can anyone offer some advice on this issue? Thanks in advance!

Wen Hao Wang
Email: [EMAIL PROTECTED]
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general