On Thu, Sep 2, 2010 at 4:16 PM, Ira Weiny <[email protected]> wrote:
> On Thu, 2 Sep 2010 11:11:13 -0700
> Chuck Hartley <[email protected]> wrote:
>
>> Sure, here is the output:
>> Note this is with the switch we swapped in, so the port numbers don't
>> match the ibchecknet output in the original message.
>>
>> # ibstat
>> CA 'mlx4_0'
>>       CA type: MT26428
>>       Number of ports: 2
>>       Firmware version: 2.6.0
>>       Hardware version: a0
>>       Node GUID: 0x0002c90300032de0
>>       System image GUID: 0x0002c90300032de3
>>       Port 1:
>>               State: Active
>>               Physical state: LinkUp
>>               Rate: 40
>>               Base lid: 6
>>               LMC: 0
>>               SM lid: 6
>
> Well, the SM LID is set here.  Is it set on the other nodes?
>
> I don't usually run ibchecknet, but I get the same errors here on a
> working fabric:
>
> ibwarn: [13629] dump_perfcounters: PortXmitWait not indicated so ignore this counter
> #warn: Lid is not configured lid 37 port 2
> #warn: SM Lid is not configured
> Port check lid 37 port 2:  FAILED
>
> Looking at this output I don't think this is an error.
>
> 13:17:14 > smpquery nodeinfo 37
> # Node info: Lid 37
> BaseVers:........................1
> ClassVers:.......................1
> NodeType:........................Switch
> NumPorts:........................24
> ...
>
> On switch external ports, the LID and SMLID are not used.
>
> Hal, would you concur?

Yes, on switch external ports, neither the LID nor the SMLID is valid.

-- Hal
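
To make the point above concrete, here is a minimal sketch of how the
ibchecknet warnings could be filtered so that "Lid is not configured"
messages on switch external ports are set aside. It assumes the warning
format shown in this thread, and it assumes you already know which LIDs
belong to switches (for example from "smpquery nodeinfo"). The function
name and the switch_lids argument are made up for illustration and are
not part of infiniband-diags.

```python
import re

# Matches the warning format quoted in this thread, e.g.
#   "#warn: Lid is not configured lid 37 port 2"
LID_WARN = re.compile(r"#warn: Lid is not configured lid (\d+) port (\d+)")


def ignorable_lid_warnings(report, switch_lids):
    """Return (lid, port) pairs whose warning refers to a switch external port.

    switch_lids is a hypothetical, caller-supplied set of LIDs known to be
    switches. Port 0 is the switch management port, so only ports other
    than 0 are treated as external here.
    """
    ignorable = []
    for line in report.splitlines():
        match = LID_WARN.search(line)
        if not match:
            continue
        lid, port = int(match.group(1)), int(match.group(2))
        if lid in switch_lids and port != 0:
            ignorable.append((lid, port))
    return ignorable


sample = """\
#warn: Lid is not configured lid 37 port 2
#warn: SM Lid is not configured
Port check lid 37 port 2:  FAILED
"""
print(ignorable_lid_warnings(sample, switch_lids={37}))
```

Warnings that survive this filter (for example on a CA port) would still
deserve a closer look.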

>
> Chuck,
> Is it just that IPoIB is not working for you?
>
> Ira
>
>
>>               Capability mask: 0x0251086a
>>               Port GUID: 0x0002c90300032de1
>>       Port 2:
>>               State: Down
>>               Physical state: Polling
>>               Rate: 10
>>               Base lid: 0
>>               LMC: 0
>>               SM lid: 0
>>               Capability mask: 0x02510868
>>               Port GUID: 0x0002c90300032de2
>> CA 'mthca0'
>>       CA type: MT25204
>>       Number of ports: 1
>>       Firmware version: 1.2.0
>>       Hardware version: a0
>>       Node GUID: 0x003048c64c0c0000
>>       System image GUID: 0x003048c64c0c0003
>>       Port 1:
>>               State: Down
>>               Physical state: Polling
>>               Rate: 10
>>               Base lid: 0
>>               LMC: 0
>>               SM lid: 0
>>               Capability mask: 0x02510a68
>>               Port GUID: 0x003048c64c0c0001
>>
>> # iblinkinfo
>> Switch 0x0002c9020041a7a0 Infiniscale-IV Mellanox Technologies:
>>            1    1[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>       5    1[  ] " HCA-1" ( )
>>            1    2[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>       6    1[  ] "linux70 HCA-1" ( )
>>            1    3[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>       7    1[  ] "linux71 HCA-1" ( )
>>            1    4[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1    5[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1    6[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1    7[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1    8[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1    9[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   10[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   11[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   12[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   13[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   14[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   15[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   16[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   17[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   18[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   19[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   20[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   21[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   22[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   23[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   24[  ] ==( 4X 5.0 Gbps Active/  LinkUp)==>       9    1[  ] " HCA-1" ( )
>>            1   25[  ] ==( 4X 5.0 Gbps Active/  LinkUp)==>       8    1[  ] " HCA-1" ( )
>>            1   26[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   27[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   28[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   29[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   30[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   31[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   32[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   33[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   34[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   35[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>            1   36[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
>>
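
For a 36-port listing like the one above, it can help to reduce the
iblinkinfo output to just the links that are up. The following is a rough
sketch that assumes the line format shown in this thread; the regular
expression and the function name are illustrative and not part of
infiniband-diags. It searches each line, so quote prefixes such as ">>"
do not get in the way.

```python
import re

# Matches the per-port lines shown in this thread, e.g.
#   "1    2[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>       6"
LINK = re.compile(
    r"(\d+)\s+(\d+)\[.*?\]\s*==\(\s*(\w+)\s+([\d.]+) Gbps\s+(\w+)/\s*(\w+)\)==>"
)


def active_links(report):
    """Return (port, width, gbps) for every port reported Active/LinkUp."""
    links = []
    for line in report.splitlines():
        match = LINK.search(line)
        if match and match.group(5) == "Active" and match.group(6) == "LinkUp":
            links.append((int(match.group(2)), match.group(3), float(match.group(4))))
    return links


sample = """\
           1    2[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>       6
           1    4[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>
           1   24[  ] ==( 4X 5.0 Gbps Active/  LinkUp)==>       9
"""
print(active_links(sample))
```

Run against the full listing, this would show ports 1 to 3 up at 10.0 Gbps
and ports 24 and 25 up at 5.0 Gbps, which makes mixed link speeds easy to
spot.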
>> On Thu, Sep 2, 2010 at 12:03 PM, Ira Weiny <[email protected]> wrote:
>> > On Thu, 2 Sep 2010 06:56:50 -0700
>> > Chuck Hartley <[email protected]> wrote:
>> >
>> >> We swapped in a different switch and see the same errors. The opensm
>> >> logfile does not show any errors:
>> >
>> > Could you run "ibstat" on the node with OpenSM running?
>> >
>> > And "iblinkinfo" on the same node?
>> >
>> > Send that output.
>> >
>> > Ira
>> >
>> >>
>> >> -------------------------------------------------
>> >> OpenSM 3.3.5
>> >> Command Line Arguments:
>> >>  Daemon mode
>> >>  Log File: /var/log/opensm.log
>> >> -------------------------------------------------
>> >> OpenSM 3.3.5
>> >>
>> >> Sep 02 05:56:29 933684 [B53B8700] 0x80 -> OpenSM 3.3.5
>> >> Entering DISCOVERING state
>> >>
>> >> Sep 02 05:56:29 934931 [B53B8700] 0x02 -> osm_vendor_init: 1000 pending umads specified
>> >> Sep 02 05:56:29 935079 [B53B8700] 0x80 -> Entering DISCOVERING state
>> >> Using default GUID 0x2c90300032de1
>> >> Entering MASTER state
>> >>
>> >> Sep 02 05:56:29 953763 [B53B8700] 0x02 -> osm_vendor_bind: Binding to port 0x2c90300032de1
>> >> Sep 02 05:56:29 990146 [B53B8700] 0x02 -> osm_vendor_bind: Binding to port 0x2c90300032de1
>> >> Sep 02 05:56:29 990240 [B53B8700] 0x02 -> osm_opensm_bind: Setting IS_SM on port 0x0002c90300032de1
>> >> Sep 02 05:56:30 009040 [AF1DB710] 0x80 -> Entering MASTER state
>> >> SUBNET UP
>> >>
>> >> Sep 02 05:56:30 009885 [AF1DB710] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
>> >> Sep 02 05:56:30 014593 [AF1DB710] 0x80 -> SUBNET UP
>> >>
>> >>
>> >> On Thu, Sep 2, 2010 at 8:56 AM, Hal Rosenstock <[email protected]> wrote:
>> >> > On Thu, Sep 2, 2010 at 8:34 AM, Chuck Hartley <[email protected]> wrote:
>> >> >> Hello,
>> >> >>
>> >> >> We installed 1.5.1 and are having problems getting the IB fabric
>> >> >> working. ibv_devinfo shows the HCA ports are ok and ibdiagnet reports
>> >> >> no errors. However, ibchecknet shows that the switch ports are not
>> >> >> being configured.  We have never seen this before and are at a loss as
>> >> >> to where the problem might be - would someone please point us in the
>> >> >> right direction to look?  Could it be a problem with the switch
>> >> >> itself? Output from ibchecknet below.
>> >> >>
>> >> >>
>> >> >> # ibchecknet
>> >> >> Error check on lid 3 (Infiniscale-IV Mellanox Technologies) port all:  FAILED
>> >> >> ibwarn: [26732] dump_perfcounters: PortXmitWait not indicated so ignore this counter
>> >> >> #warn: Lid is not configured lid 3 port 7
>> >> >> #warn: SM Lid is not configured
>> >> >
>> >> > Is there an SM running on your subnet? If not, I think that the lack
>> >> > of an SM could account for all of the issues mentioned here.
>> >> >
>> >> > -- Hal
>> >> >
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> >> the body of a message to [email protected]
>> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >>
>> >
>> >
>> > --
>> > Ira Weiny
>> > Math Programmer/Computer Scientist
>> > Lawrence Livermore National Lab
>> > 925-423-8008
>> > [email protected]
>> >
>>
>
>
> --
> Ira Weiny
> Math Programmer/Computer Scientist
> Lawrence Livermore National Lab
> 925-423-8008
> [email protected]
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
