Hi Don,

On 5/1/2013 3:40 AM, Skidmore, Donald C wrote:
Hi Nishit,

The rx_no_dma_resources counter means we are dropping packets because we don't 
have any free descriptors in the RX queue.  The rx_missed_errors counter means 
there was insufficient space to store an ingress packet; basically, we ran out 
of packet-buffer space or bandwidth on the PCIe bus.
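A quick way to tell which counter is climbing is to sample `ethtool -S` twice and compute per-second rates. A minimal sketch (my own helper, not part of any tool; counter names as reported by `ethtool -S eth0`):

```python
import re
import subprocess
import time

def parse_ethtool_stats(text):
    """Parse 'ethtool -S' output into a {counter: value} dict."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*([\w.\[\]]+):\s*(\d+)\s*$", line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats

def drop_rates(iface="eth0", interval=1.0,
               counters=("rx_missed_errors", "rx_no_dma_resources")):
    """Sample the stats twice and report per-second increase for each counter."""
    def sample():
        out = subprocess.check_output(["ethtool", "-S", iface]).decode()
        return parse_ethtool_stats(out)
    before = sample()
    time.sleep(interval)
    after = sample()
    return {c: (after.get(c, 0) - before.get(c, 0)) / interval
            for c in counters}
```

If rx_missed_errors dominates, the bottleneck is packet-buffer space or PCIe bandwidth; if rx_no_dma_resources dominates, it is free descriptors in the RX ring.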

Thanks for the explanation. Is there any way to verify whether we are running out of buffers or out of bandwidth on the PCIe bus?
                (ixgbe loading shows PCI Express 5.0GT/s, Width x8.)

        # dmesg | grep "0000:06:00"
        ixgbe 0000:06:00.0: PCI->APIC IRQ transform: INT C -> IRQ 18
        ixgbe 0000:06:00.0: setting latency timer to 64
        ixgbe 0000:06:00.0: irq 40 for MSI/MSI-X
        ixgbe 0000:06:00.0: irq 41 for MSI/MSI-X
        ixgbe 0000:06:00.0: irq 42 for MSI/MSI-X
        ixgbe 0000:06:00.0: (PCI Express:5.0GT/s:Width x8) 00:90:fb:45:f1:76
        ixgbe 0000:06:00.0: eth0: MAC: 2, PHY: 14, SFP+: 5, PBA No:
        ixgbe 0000:06:00.0: eth0: Enabled Features: RxQ: 2 TxQ: 2 FdirHash RSS
        ixgbe 0000:06:00.0: eth0: Intel(R) 10 Gigabit Network Connection
        ixgbe 0000:06:00.1: PCI->APIC IRQ transform: INT D -> IRQ 19
        ixgbe 0000:06:00.1: setting latency timer to 64
        ixgbe 0000:06:00.1: irq 43 for MSI/MSI-X
        ixgbe 0000:06:00.1: irq 44 for MSI/MSI-X
        ixgbe 0000:06:00.1: irq 45 for MSI/MSI-X
        ixgbe 0000:06:00.1: (PCI Express:5.0GT/s:Width x8) 00:90:fb:45:f1:77
        ixgbe 0000:06:00.1: eth1: MAC: 2, PHY: 14, SFP+: 6, PBA No:
        ixgbe 0000:06:00.1: eth1: Enabled Features: RxQ: 2 TxQ: 2 FdirHash RSS
        ixgbe 0000:06:00.1: eth1: Intel(R) 10 Gigabit Network Connection
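For reference, the usable bandwidth of the link reported above can be checked against the 10G line rate. A Gen2 (5 GT/s) lane uses 8b/10b encoding, so only 80% of the raw bits carry data; a quick back-of-the-envelope helper (my own, not part of any tool):

```python
def pcie_bandwidth_gbytes(gt_per_s, width, encoding_efficiency=0.8):
    """Raw one-direction PCIe bandwidth in GB/s.

    gt_per_s: transfer rate per lane (5.0 for Gen2).
    encoding_efficiency: 8b/10b encoding keeps 80% of raw bits (Gen1/Gen2).
    """
    return gt_per_s * width * encoding_efficiency / 8  # 8 bits per byte

# 5 GT/s x8 gives 4.0 GB/s per direction, well above the ~1.25 GB/s
# a single 10 Gb/s port needs, before protocol overhead.
print(pcie_bandwidth_gbytes(5.0, 8))
```

So if lspci confirms the link actually trained at 5GT/s Width x8, raw PCIe bandwidth is unlikely to be the bottleneck for one port.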

    One other interesting observation:
When we changed the packet buffer from 512 KB to 128 KB (by changing rx_pb_size in ixgbe_82599.c), per-packet latency dropped from about 500 microseconds to about 100 microseconds for 64-byte packets.
    Does this point to some relation with the packet buffer size?
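One plausible explanation (an assumption on my part, not confirmed by the driver source): when the packet buffer runs full, a newly arriving packet waits behind up to a full buffer's worth of data, so worst-case queueing delay scales with buffer size. Draining a full buffer at the 10 Gb/s line rate:

```python
def buffer_drain_us(buffer_kb, line_rate_gbps=10.0):
    """Worst-case time (us) to drain a full packet buffer at line rate."""
    bits = buffer_kb * 1024 * 8
    return bits / (line_rate_gbps * 1e9) * 1e6

# 512 KB drains in ~419 us, 128 KB in ~105 us: close to the observed
# ~500 us -> ~100 us change in per-packet latency.
print(buffer_drain_us(512), buffer_drain_us(128))
```

That match would suggest the extra latency is queueing delay in a persistently full packet buffer rather than per-packet processing cost.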



All that said, when you see the rx_no_dma_resources errors, is their rate 
comparable with what you were seeing for rx_missed_errors?  Both lead to the 
same thing: dropped packets.

I have found the frame sizes at which we start getting rx_missed_errors; below are the counter rates at those sizes.

    frame size 110 bytes        (avg. latency 45 microseconds)
        - no rx_missed_errors
        - rx_no_dma_resources increase rate is 8,200,000/sec

    frame size 108 bytes        (avg. latency 345 microseconds)
        - rx_missed_errors increase rate is 207,000/sec
        - rx_no_dma_resources increase rate is 8,300,000/sec
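For context on those rates, the theoretical line rate at these frame sizes (adding 20 bytes of preamble plus inter-frame gap per frame) works out as follows; a small helper of my own:

```python
def line_rate_pps(frame_bytes, rate_gbps=10.0, overhead_bytes=20):
    """Max frames/sec at a given line rate; 20 B = 8 B preamble + 12 B IFG."""
    return rate_gbps * 1e9 / ((frame_bytes + overhead_bytes) * 8)

# ~9.6 Mpps at 110-byte frames and ~9.8 Mpps at 108-byte frames, so an
# rx_no_dma_resources rate of ~8.2-8.3 M/sec means the bulk of the
# offered load is being dropped at the descriptor ring.
print(line_rate_pps(110), line_rate_pps(108))
```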


Also, what does 'lspci -vvv' show?  I'm looking to see if you are getting the 
full PCIe bandwidth.  You could also try turning on flow control (FC), which 
should reduce these types of overflow.

Enabling flow control actually increases the latency further. It seems the tester machine does not honor the PAUSE frames, and enabling FC also clears the DROP_EN bit, which again increases the latency.
     The lspci -vvv output is attached to this mail.


Thanks,
-Don Skidmore <[email protected]>

Rgds,
Nishit Shah.

-----Original Message-----
From: Nishit Shah [mailto:[email protected]]
Sent: Tuesday, April 30, 2013 9:07 AM
To: [email protected]
Subject: [E1000-devel] 82599 latency increase with rx_missed_errors


Hi,

      We are measuring per-packet latencies at various packet sizes (64 bytes
to 1518 bytes) with an 82599 card and ixgbe driver 3.7.21.

Setup:

      Spirent test center sender <---10G---> machine with 82599 <---10G---> Spirent test center receiver
                                             (ixgbe 3.7.21, vanilla kernel 2.6.39.4)

      When neither "rx_missed_errors" nor "rx_no_dma_resources" is increasing,
we get per-packet latency around 40-70 microseconds. ("rx_no_buffer_count" is
not increasing.)
      When "rx_no_dma_resources" is increasing, we still get per-packet
latency around 40-70 microseconds. ("rx_no_buffer_count" is not increasing.)
      When "rx_missed_errors" is increasing, per-packet latency jumps to
around 500 microseconds. ("rx_no_buffer_count" is not increasing.)

      Is there any specific reason for the latency increase when
"rx_missed_errors" increases?
      Is there a way to control it?

      Below are the machine details.
=========================================================================================================
Machine details:

      CPU:          Dual Core Intel(R) Celeron(R) CPU G540 @ 2.50GHz
      Memory:       2 GB
      Kernel:       vanilla 2.6.39.4
      Interface tuning parameters:
                  Auto-negotiation is off (DROP_EN is set.)
                  ethtool -G eth0 rx 64 tx 128 ; ethtool -G eth1 rx 64 tx 128
                  rx-usecs is set to 50.
      ethtool and lspci for bus information:

          # ethtool -i eth0
          driver: ixgbe
          version: 3.7.21-NAPI
          firmware-version: 0x80000345
          bus-info: 0000:06:00.0
          #
          # ethtool -i eth1
          driver: ixgbe
          version: 3.7.21-NAPI
          firmware-version: 0x80000345
          bus-info: 0000:06:00.1

          06:00.0 Class 0200: Device 8086:10fb (rev 01)
              Subsystem: Device 15bb:30e0
              Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
              Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
              Latency: 0, Cache Line Size: 64 bytes
              Interrupt: pin A routed to IRQ 18
              Region 0: Memory at f7520000 (64-bit, non-prefetchable) [size=128K]
              Region 2: I/O ports at 8020 [size=32]
              Region 4: Memory at f7544000 (64-bit, non-prefetchable) [size=16K]
              Capabilities: [40] Power Management version 3
                  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
                  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
              Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                  Address: 0000000000000000  Data: 0000
                  Masking: 00000000  Pending: 00000000
              Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
                  Vector table: BAR=4 offset=00000000
                  PBA: BAR=4 offset=00002000
              Capabilities: [a0] Express (v2) Endpoint, MSI 00
                  DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
                      ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                  DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                      RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                      MaxPayload 128 bytes, MaxReadReq 512 bytes
                  DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
                  LnkCap: Port #1, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 <2us, L1 <32us
                      ClockPM- Surprise- LLActRep- BwNot-
                  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                      ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                  LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                  DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
                  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                  LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                      Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                      Compliance De-emphasis: -6dB
                  LnkSta2: Current De-emphasis Level: -6dB
              Capabilities: [e0] Vital Product Data
                  Unknown small resource type 06, will not decode more.
              Kernel driver in use: ixgbe
              Kernel modules: ixgbe

          06:00.1 Class 0200: Device 8086:10fb (rev 01)
              Subsystem: Device 15bb:30e0
              Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
              Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
              Latency: 0, Cache Line Size: 64 bytes
              Interrupt: pin B routed to IRQ 19
              Region 0: Memory at f7500000 (64-bit, non-prefetchable) [size=128K]
              Region 2: I/O ports at 8000 [size=32]
              Region 4: Memory at f7540000 (64-bit, non-prefetchable) [size=16K]
              Capabilities: [40] Power Management version 3
                  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
                  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
              Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                  Address: 0000000000000000  Data: 0000
                  Masking: 00000000  Pending: 00000000
              Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
                  Vector table: BAR=4 offset=00000000
                  PBA: BAR=4 offset=00002000
              Capabilities: [a0] Express (v2) Endpoint, MSI 00
                  DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
                      ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                  DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                      RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                      MaxPayload 128 bytes, MaxReadReq 512 bytes
                  DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
                  LnkCap: Port #1, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 <2us, L1 <32us
                      ClockPM- Surprise- LLActRep- BwNot-
                  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                      ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                  LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                  DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
                  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                  LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                      Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                      Compliance De-emphasis: -6dB
                  LnkSta2: Current De-emphasis Level: -6dB
              Capabilities: [e0] Vital Product Data
                  Unknown small resource type 06, will not decode more.
              Kernel driver in use: ixgbe
              Kernel modules: ixgbe
=========================================================================================================

Rgds,
Nishit Shah.

------------------------------------------------------------------------------
Introducing AppDynamics Lite, a free troubleshooting tool for Java/.NET.
Get 100% visibility into your production application - at no cost.
Code-level diagnostics for performance bottlenecks with <2% overhead.
Download for free and get started troubleshooting in minutes.
http://p.sf.net/sfu/appdyn_d2d_ap1
_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired
