On Thu, Sep 2, 2010 at 4:16 PM, Ira Weiny <[email protected]> wrote:
> On Thu, 2 Sep 2010 11:11:13 -0700
> Chuck Hartley <[email protected]> wrote:
>
>> Sure, here is the output:
>> Note this is with the switch we swapped in, so the port numbers don't
>> match the ibchecknet output in the original message.
>>
>> # ibstat
>> CA 'mlx4_0'
>>         CA type: MT26428
>>         Number of ports: 2
>>         Firmware version: 2.6.0
>>         Hardware version: a0
>>         Node GUID: 0x0002c90300032de0
>>         System image GUID: 0x0002c90300032de3
>>         Port 1:
>>                 State: Active
>>                 Physical state: LinkUp
>>                 Rate: 40
>>                 Base lid: 6
>>                 LMC: 0
>>                 SM lid: 6
>
> Well the SM lid is set here. Is it set on the other nodes?
>
> I don't run ibchecknet usually but I am getting the same errors here on a
> working fabric...
>
> ibwarn: [13629] dump_perfcounters: PortXmitWait not indicated so ignore this
> counter
> #warn: Lid is not configured lid 37 port 2
> #warn: SM Lid is not configured
> Port check lid 37 port 2: FAILED
>
> Looking at this output I don't think this is an error.
>
> 13:17:14 > smpquery nodeinfo 37
> # Node info: Lid 37
> BaseVers:........................1
> ClassVers:.......................1
> NodeType:........................Switch
> NumPorts:........................24
> ...
>
> On switch external Ports the Lid and SMLid are not used.
>
> Hal, would you concur?

Yes, on switch external ports, both LID and SMLID are not valid.

-- Hal

>
> Chuck,
> Is it just that IPoIB is not working for you?
>
> Ira
>
>
>>                 Capability mask: 0x0251086a
>>                 Port GUID: 0x0002c90300032de1
>>         Port 2:
>>                 State: Down
>>                 Physical state: Polling
>>                 Rate: 10
>>                 Base lid: 0
>>                 LMC: 0
>>                 SM lid: 0
>>                 Capability mask: 0x02510868
>>                 Port GUID: 0x0002c90300032de2
>> CA 'mthca0'
>>         CA type: MT25204
>>         Number of ports: 1
>>         Firmware version: 1.2.0
>>         Hardware version: a0
>>         Node GUID: 0x003048c64c0c0000
>>         System image GUID: 0x003048c64c0c0003
>>         Port 1:
>>                 State: Down
>>                 Physical state: Polling
>>                 Rate: 10
>>                 Base lid: 0
>>                 LMC: 0
>>                 SM lid: 0
>>                 Capability mask: 0x02510a68
>>                 Port GUID: 0x003048c64c0c0001
>>
>> # iblinkinfo
>> Switch 0x0002c9020041a7a0 Infiniscale-IV Mellanox Technologies:
>>   1  1[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==>  5  1[ ] " HCA-1" ( )
>>   1  2[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==>  6  1[ ] "linux70 HCA-1" ( )
>>   1  3[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==>  7  1[ ] "linux71 HCA-1" ( )
>>   1  4[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1  5[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1  6[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1  7[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1  8[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1  9[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 10[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 11[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 12[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 13[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 14[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 15[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 16[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 17[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 18[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 19[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 20[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 21[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 22[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 23[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 24[ ] ==( 4X  5.0 Gbps Active/ LinkUp)==>  9  1[ ] " HCA-1" ( )
>>   1 25[ ] ==( 4X  5.0 Gbps Active/ LinkUp)==>  8  1[ ] " HCA-1" ( )
>>   1 26[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 27[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 28[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 29[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 30[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 31[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 32[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 33[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 34[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 35[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>   1 36[ ] ==( 4X  2.5 Gbps  Down/ Polling)==>       [ ] "" ( )
>>
>> On Thu, Sep 2, 2010 at 12:03 PM, Ira Weiny <[email protected]> wrote:
>> > On Thu, 2 Sep 2010 06:56:50 -0700
>> > Chuck Hartley <[email protected]> wrote:
>> >
>> >> We swapped in a different switch and see the same errors. The opensm
>> >> logfile does not show any errors:
>> >
>> > Could you run "ibstat" on the node with OpenSM running?
>> >
>> > And "iblinkinfo" on the same node?
>> >
>> > Send that output.
>> >
>> > Ira
>> >
>> >>
>> >> -------------------------------------------------
>> >> OpenSM 3.3.5
>> >> Command Line Arguments:
>> >> Daemon mode
>> >> Log File: /var/log/opensm.log
>> >> -------------------------------------------------
>> >> OpenSM 3.3.5
>> >>
>> >> Sep 02 05:56:29 933684 [B53B8700] 0x80 -> OpenSM 3.3.5
>> >> Entering DISCOVERING state
>> >>
>> >> Sep 02 05:56:29 934931 [B53B8700] 0x02 -> osm_vendor_init: 1000
>> >> pending umads specified
>> >> Sep 02 05:56:29 935079 [B53B8700] 0x80 -> Entering DISCOVERING state
>> >> Using default GUID 0x2c90300032de1
>> >> Entering MASTER state
>> >>
>> >> Sep 02 05:56:29 953763 [B53B8700] 0x02 -> osm_vendor_bind: Binding to
>> >> port 0x2c90300032de1
>> >> Sep 02 05:56:29 990146 [B53B8700] 0x02 -> osm_vendor_bind: Binding to
>> >> port 0x2c90300032de1
>> >> Sep 02 05:56:29 990240 [B53B8700] 0x02 -> osm_opensm_bind: Setting
>> >> IS_SM on port 0x0002c90300032de1
>> >> Sep 02 05:56:30 009040 [AF1DB710] 0x80 -> Entering MASTER state
>> >> SUBNET UP
>> >>
>> >> Sep 02 05:56:30 009885 [AF1DB710] 0x02 -> osm_ucast_mgr_process:
>> >> minhop tables configured on all switches
>> >> Sep 02 05:56:30 014593 [AF1DB710] 0x80 -> SUBNET UP
>> >>
>> >>
>> >> On Thu, Sep 2, 2010 at 8:56 AM, Hal Rosenstock <[email protected]>
>> >> wrote:
>> >> > On Thu, Sep 2, 2010 at 8:34 AM, Chuck Hartley <[email protected]>
>> >> > wrote:
>> >> >> Hello,
>> >> >>
>> >> >> We installed 1.5.1 and are having problems getting the IB fabric
>> >> >> working. ibv_devinfo shows the HCAs ports are ok and ibdiagnet reports
>> >> >> no errors. However, ibchecknet shows that the switch ports are not
>> >> >> being configured. We have never seen this before and are at a loss as
>> >> >> to where the problem might be - would someone please point us in the
>> >> >> right direction to look? Could it be a problem with the switch
>> >> >> itself? Output from ibchecknet below.
>> >> >>
>> >> >>
>> >> >> # ibchecknet
>> >> >> Error check on lid 3 (Infiniscale-IV Mellanox Technologies) port all:
>> >> >> FAILED
>> >> >> ibwarn: [26732] dump_perfcounters: PortXmitWait not indicated so
>> >> >> ignore this counter
>> >> >> #warn: Lid is not configured lid 3 port 7
>> >> >> #warn: SM Lid is not configured
>> >> >
>> >> > Is there an SM running on your subnet? If not, I think that the lack
>> >> > of an SM could account for all of the issues mentioned here.
>> >> >
>> >> > -- Hal
>> >> >
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> >> the body of a message to [email protected]
>> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> >>
>> >
>> >
>> > --
>> > Ira Weiny
>> > Math Programmer/Computer Scientist
>> > Lawrence Livermore National Lab
>> > 925-423-8008
>> > [email protected]
>> >
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
>
> --
> Ira Weiny
> Math Programmer/Computer Scientist
> Lawrence Livermore National Lab
> 925-423-8008
> [email protected]
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
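
The check Ira asks for in this thread ("the SM lid is set here, is it set on the other nodes?") can be scripted against saved ibstat output. The following is only a rough sketch in Python: the function names parse_ibstat and ports_missing_sm are made up for illustration, the field names ("State", "Base lid", "SM lid") are taken from the ibstat output quoted in the thread, and the sample text is a trimmed copy of that output. It applies to CA ports only. As discussed in the thread, switch external ports do not carry a LID or SM LID, so the ibchecknet warnings for them are expected.

```python
import re


def parse_ibstat(text):
    """Return one dict per port found in ibstat-style text.

    Hypothetical helper: it only understands the "CA '<name>'",
    "Port <n>:" and "key: value" lines seen in the thread.
    """
    ports = []
    ca = None
    port = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"CA '(.+)'$", line)
        if m:
            # A new CA starts; CA-level fields that follow are skipped.
            ca, port = m.group(1), None
            continue
        m = re.match(r"Port (\d+):$", line)
        if m:
            port = {"ca": ca, "port": int(m.group(1))}
            ports.append(port)
            continue
        if port is not None and ":" in line:
            key, _, value = line.partition(":")
            port[key.strip()] = value.strip()
    return ports


def ports_missing_sm(ports):
    """Active CA ports whose SM lid is still 0."""
    return [
        p for p in ports
        if p.get("State") == "Active" and int(p.get("SM lid", "0")) == 0
    ]


# Trimmed copy of the ibstat output posted in the thread.
SAMPLE = """\
CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 2
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 6
                LMC: 0
                SM lid: 6
        Port 2:
                State: Down
                Physical state: Polling
                Rate: 10
                Base lid: 0
                LMC: 0
                SM lid: 0
"""

if __name__ == "__main__":
    found = parse_ibstat(SAMPLE)
    for p in found:
        print(p["ca"], "port", p["port"], p["State"],
              "lid", p["Base lid"], "sm lid", p["SM lid"])
    print("active ports without an SM lid:", ports_missing_sm(found))
```

Down ports are ignored on purpose: a port that is still in Polling has no LID or SM LID yet, so only an Active port with SM lid 0 would suggest that no subnet manager has configured it.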
