On 05/18/2012 06:07 AM, Hal Rosenstock wrote:
> On 5/18/2012 2:05 AM, Bob Ciotti wrote:
>>
>>
>> I'm seeing lots of these messages in SM log:
>>
>> May 17 22:36:04 947774 [DA234710] 0x01 -> log_trap_info: Received
>> Generic Notice type:1 num:131 (Flow Control Update watchdog timer
>> expired) Producer:2 (Switch) from LID:444 Port 5 TID:0x0000000000000025
>>
>> the referenced port is a switch to HCA link.
>>
>> I've seen this in cases where there was bad hardware. Spec says failure
>> in flow control machine on other end. But let's assume hardware was good.
>> When could this occur?
>
> Do OperationalVLs match on both sides of the link ? Are you
> using/configuring QoS ?
>
There are two separate fabrics, one on each port of a 2-port HCA.
The issue is seen on both fabrics.
Normally we use QoS on both fabrics. QoS is now disabled on
ib0 (HCA port 1):
r327i7n0 ~ # smpquery portinfo 248 | grep VL
VLCap:...........................VL0-7
VLHighLimit:.....................4
VLArbHighCap:....................8
VLArbLowCap:.....................8
VLStallCount:....................0
OperVLs:.........................VL0-7
r327i7n0 ~ # smpquery -D portinfo 0 1 | grep VL
VLCap:...........................VL0-7
VLHighLimit:.....................4
VLArbHighCap:....................8
VLArbLowCap:.....................8
VLStallCount:....................0
OperVLs:.........................VL0-7
r327i7n0 ~ # smpquery -D portinfo 0,1 1 | grep VL
VLCap:...........................VL0-7
VLHighLimit:.....................4
VLArbHighCap:....................8
VLArbLowCap:.....................8
VLStallCount:....................7
OperVLs:.........................VL0-7
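One way to answer the "do OperationalVLs match on both sides" question directly from the pasted output is to diff the VL fields of the two ends of the link. The following is a minimal Python sketch, assuming the `Name:....value` layout shown above; `parse_portinfo` and `vl_mismatches` are hypothetical helper names, not part of infiniband-diags.

```python
import re

def parse_portinfo(text):
    """Parse 'Name:.....value' lines of smpquery portinfo output into a dict."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^\s*([A-Za-z0-9]+):\.+(.*)$", line)
        if m:
            fields[m.group(1)] = m.group(2).strip()
    return fields

def vl_mismatches(a, b, keys=("VLCap", "OperVLs", "VLStallCount")):
    """Return {field: (a_value, b_value)} for the listed fields that differ."""
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# Output of "smpquery -D portinfo 0 1" (HCA port), copied from above.
hca_port = """\
VLCap:...........................VL0-7
VLHighLimit:.....................4
VLArbHighCap:....................8
VLArbLowCap:.....................8
VLStallCount:....................0
OperVLs:.........................VL0-7
"""

# Output of "smpquery -D portinfo 0,1 1" (switch port), copied from above.
switch_port = """\
VLCap:...........................VL0-7
VLHighLimit:.....................4
VLArbHighCap:....................8
VLArbLowCap:.....................8
VLStallCount:....................7
OperVLs:.........................VL0-7
"""

diff = vl_mismatches(parse_portinfo(hca_port), parse_portinfo(switch_port))
print(diff)
```

For the two outputs above, OperVLs and VLCap agree on both ends; the only listed field that differs is VLStallCount (0 on the HCA port, 7 on the switch port).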
r327i7n0 ~ # ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 2
Firmware version: 2.10.4350
Hardware version: 0
Node GUID: 0x0002c90300336b20
System image GUID: 0x0002c90300336b23
Port 1:
State: Active
Physical state: LinkUp
Rate: 56
Base lid: 248
LMC: 0
SM lid: 1
Capability mask: 0x02514868
Port GUID: 0x0002c90300336b21
Link layer: InfiniBand
Port 2:
State: Active
Physical state: LinkUp
Rate: 56
Base lid: 1971
LMC: 0
SM lid: 1685
Capability mask: 0x02514868
Port GUID: 0x0002c90300336b22
Link layer: InfiniBand
r327i7n0 ~ # smpquery -D nodeinfo 0,1 1
# Node info: DR path slid 65535; dlid 65535; 0,1
BaseVers:........................1
ClassVers:.......................1
NodeType:........................Switch
NumPorts:........................36
SystemGuid:......................0x080069000000a4db
Guid:............................0x080069000000a4d8
PortGuid:........................0x080069000000a4d8
PartCap:.........................8
DevId:...........................0xc738
Revision:........................0x000000a1
LocalPort:.......................1
VendorId:........................0x0002c9
r327i7n0 ~ # smpquery -D nodedesc 0,1
Node Description:.SwitchX - Mellanox Technologies
r327i7n0 ~ # smpquery -D sl2vl 0,1 1
# SL2VL table: DR path slid 65535; dlid 65535; 0,1
# SL: | 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|
ports: in 0, out 1: | 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
ports: in 1, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 2, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 3, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 4, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 5, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 6, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 7, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 8, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 9, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 10, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 11, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 12, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 13, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 14, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 15, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 16, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 17, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 18, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 19, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 20, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 21, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 22, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 23, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 24, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 25, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 26, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 27, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 28, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 29, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 30, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 31, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 32, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 33, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 34, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 35, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
ports: in 36, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|
r327i7n0 ~ # smpquery -D sl2vl 0 1
# SL2VL table: DR path slid 65535; dlid 65535; 0
# SL: | 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|
ports: in 0, out 0: | 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
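A quick sanity check on tables like these is to confirm that no SL is mapped to a VL outside the operational range reported by portinfo (VL0-7, i.e. 8 data VLs here). The sketch below assumes the row layout printed by `smpquery sl2vl` above; `parse_sl2vl_row` and `sls_outside_oper_vls` are hypothetical names used only for illustration.

```python
import re

ROW = re.compile(r"ports:\s*in\s+(\d+),\s*out\s+(\d+):(.*)")

def parse_sl2vl_row(line):
    """Parse 'ports: in N, out M: | v0| v1|...|' into (in_port, out_port, vls)."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError("not an SL2VL row: %r" % line)
    in_port, out_port = int(m.group(1)), int(m.group(2))
    # The list index is the SL, the value is the VL it maps to.
    vls = [int(v) for v in re.findall(r"\d+", m.group(3))]
    return in_port, out_port, vls

def sls_outside_oper_vls(vls, oper_vl_count):
    """Return the SLs whose VL is not below oper_vl_count."""
    return [sl for sl, vl in enumerate(vls) if vl >= oper_vl_count]

# Row copied from the switch table above (ib0, in 1, out 1).
row = "ports: in 1, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 0| 1| 2| 3| 4| 5| 6| 7|"
in_port, out_port, vls = parse_sl2vl_row(row)

# OperVLs is VL0-7 on this link, so 8 data VLs are operational.
print(sls_outside_oper_vls(vls, 8))
```

With 8 operational VLs the list is empty for this row. If a link had come up with fewer operational VLs than the table assumes, the same call with the smaller count would list the affected SLs.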
r327i7n0 ~ # smpquery -D vlarb 0,1 1
# VLArbitration tables: DR path slid 65535; dlid 65535; 0,1 port 1 LowCap 8 HighCap 8
# Low priority VL Arbitration Table:
VL : |0x0 |0x1 |0x2 |0x3 |0x4 |0x5 |0x6 |0x7 |
WEIGHT: |0x1 |0x1 |0x1 |0x1 |0x1 |0x1 |0x1 |0x1 |
# High priority VL Arbitration Table:
VL : |0x0 |0x1 |0x2 |0x3 |0x4 |0x5 |0x6 |0x7 |
WEIGHT: |0x1 |0x1 |0x1 |0x1 |0x1 |0x1 |0x1 |0x1 |
r327i7n0 ~ # smpquery -D vlarb 0 1
# VLArbitration tables: DR path slid 65535; dlid 65535; 0 port 1 LowCap 8 HighCap 8
# Low priority VL Arbitration Table:
VL : |0x0 |0x1 |0x2 |0x3 |0x4 |0x5 |0x6 |0x7 |
WEIGHT: |0x20|0x20|0x20|0x20|0x20|0x20|0x20|0x20|
# High priority VL Arbitration Table:
VL : |0x0 |0x1 |0x2 |0x3 |0x4 |0x5 |0x6 |0x7 |
WEIGHT: |0x0 |0x0 |0x0 |0x0 |0x0 |0x0 |0x0 |0x0 |
On ib1 (HCA port 2), QoS is enabled:
r327i7n0 ~ # smpquery -P2 -D sl2vl 0 2
# SL2VL table: DR path slid 65535; dlid 65535; 0
# SL: | 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|
ports: in 0, out 0: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
r327i7n0 ~ # smpquery -P2 -D sl2vl 0,2 1
# SL2VL table: DR path slid 65535; dlid 65535; 0,2
# SL: | 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|
ports: in 0, out 1: | 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
ports: in 1, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 2, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 3, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 4, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 5, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 6, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 7, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 8, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 9, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 10, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 11, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 12, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 13, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 14, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 15, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 16, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 17, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 18, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 19, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 20, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 21, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 22, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 23, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 24, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 25, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 26, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 27, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 28, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 29, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 30, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 31, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 32, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 33, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 34, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 35, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
ports: in 36, out 1: | 0| 1| 2| 3| 4| 5| 6| 7| 3| 4| 5| 6| 7| 3| 4| 5|
r327i7n0 ~ # smpquery -P2 -D vlarb 0,2 1
# VLArbitration tables: DR path slid 65535; dlid 65535; 0,2 port 1 LowCap 8 HighCap 8
# Low priority VL Arbitration Table:
VL : |0x0 |0x1 |0x2 |0x3 |0x4 |0x5 |0x6 |0x7 |
WEIGHT: |0x0 |0x0 |0x0 |0x40|0x40|0x40|0x40|0x40|
# High priority VL Arbitration Table:
VL : |0x0 |0x1 |0x2 |0x0 |0x0 |0x0 |0x0 |0x0 |
WEIGHT: |0x80|0x40|0x40|0x0 |0x0 |0x0 |0x0 |0x0 |
r327i7n0 ~ # smpquery -P2 -D vlarb 0 2
# VLArbitration tables: DR path slid 65535; dlid 65535; 0 port 2 LowCap 8 HighCap 8
# Low priority VL Arbitration Table:
VL : |0x0 |0x1 |0x2 |0x3 |0x4 |0x5 |0x6 |0x7 |
WEIGHT: |0x0 |0x0 |0x0 |0x40|0x40|0x40|0x40|0x40|
# High priority VL Arbitration Table:
VL : |0x0 |0x1 |0x2 |0x0 |0x0 |0x0 |0x0 |0x0 |
WEIGHT: |0x80|0x40|0x40|0x0 |0x0 |0x0 |0x0 |0x0 |
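With QoS enabled, one thing worth cross-checking is that every VL the SL2VL table can select also has a nonzero weight in at least one of the two arbitration tables. The sketch below does that for the ib1 output above. It is only an illustration under the assumption that the `VL :` and `WEIGHT:` rows are laid out as printed; `parse_arb_table` and `serviced_vls` are hypothetical names, and nothing here establishes a link between arbitration weights and the watchdog trap.

```python
import re

HEX = re.compile(r"0x[0-9a-fA-F]+")

def parse_arb_table(vl_line, weight_line):
    """Pair the VL and WEIGHT rows of one table as a list of (vl, weight)."""
    vls = [int(x, 16) for x in HEX.findall(vl_line)]
    weights = [int(x, 16) for x in HEX.findall(weight_line)]
    return list(zip(vls, weights))

def serviced_vls(*tables):
    """Return the set of VLs that have a nonzero weight in any table."""
    return {vl for table in tables for vl, weight in table if weight > 0}

# ib1 HCA port 2 tables, copied from the "vlarb 0 2" output above.
low = parse_arb_table(
    "VL : |0x0 |0x1 |0x2 |0x3 |0x4 |0x5 |0x6 |0x7 |",
    "WEIGHT: |0x0 |0x0 |0x0 |0x40|0x40|0x40|0x40|0x40|",
)
high = parse_arb_table(
    "VL : |0x0 |0x1 |0x2 |0x0 |0x0 |0x0 |0x0 |0x0 |",
    "WEIGHT: |0x80|0x40|0x40|0x0 |0x0 |0x0 |0x0 |0x0 |",
)

# SL-to-VL mapping for ib1, copied from the "sl2vl 0 2" row above.
sl2vl = [0, 1, 2, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 3, 4, 5]

unserviced = sorted(set(sl2vl) - serviced_vls(low, high))
print(unserviced)
```

For the ib1 tables shown, VL0-2 are covered by the high-priority table and VL3-7 by the low-priority table, so the list of unserviced VLs is empty.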
>> Only in the case of FW bug?
> I don't think flow control is performed by FW.
>> Any tunables that might impact this?
> No IBA standard ones AFAIK. Who's the HCA vendor ?
> -- Hal
bob
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html