On Mon, 09 Aug 2010 13:23:29 +0100
Gavin Hamill <[email protected]> wrote:

> Hi all,
> 
> I'm running Debian Lenny 64-bit (kernel 2.6.26, e1000e version
> 0.3.3.3-k2) on a couple of routers which are dropping an alarming number
> of packets, and wondered if I could consult the list about it?
> 
> The machines are good-quality HP DL320 G5s, each with two dual-port
> PCI-e x4 82571EB cards, so 4 x 82571EB ports in total on eth0 -> eth3
> (plus 2 on-board ports, eth4 and eth5, for management).
> 
> eth2 + eth3 are part of 'bond1'
> 
> cor4:~# ethtool -S eth2 | grep -v ": 0"
> NIC statistics:
>      rx_packets: 276763917957
>      tx_packets: 471895773086
>      rx_bytes: 342602999180888
>      tx_bytes: 588662613526252
>      rx_broadcast: 132740615
>      tx_broadcast: 29
>      rx_multicast: 63675224
>      tx_multicast: 1398527
>      rx_errors: 83
>      multicast: 63675224
>      rx_crc_errors: 83
>      rx_missed_errors: 2427853215
>      tx_restart_queue: 200354
>      rx_long_byte_count: 342602999180888
>      rx_csum_offload_good: 276600305105
>      rx_csum_offload_errors: 21438
> 
> rx_missed_errors is rising by thousands of packets per second; the RX
> ring size was 256 descriptors - I increased it to 2048 but it has made
> no difference.
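
For reference, the ring settings can be checked and changed with ethtool
along these lines (untested sketch; eth2 and the 2048 value are just the
ones from your description):

  # show the current and maximum supported ring sizes
  ethtool -g eth2
  # raise the RX ring to 2048 descriptors (must not exceed the
  # "Pre-set maximums" reported by the command above)
  ethtool -G eth2 rx 2048
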
> 
> The machine is doing a large amount of network I/O - eth2 + eth3 each
> average 500Mbps/50k pps on both rx+tx.
> 
>  13:06:51 up 424 days, 21:12,  3 users,  load average: 0.05, 0.14, 0.10
> 
> My research so far suggests that rx_missed_errors could be a result of
> inadequate bus bandwidth... but the NICs seem to be configured correctly
> for optimum usage:
> 
> 1269:1523052557 1520343556 1579949975 1579235679 PCI-MSI-edge    eth3
> 1270:4254381790 4256595268 4153628417 4154259718 PCI-MSI-edge    eth2
> 1271: 193042289  193698664  192770825  192359423 PCI-MSI-edge    eth1
> 1272:3932743448 3931616894 3976527168 3976601952 PCI-MSI-edge    eth0
> 
> External link to preserve the ASCII formatting:
> 
> http://194.24.251.10/lspci-vv.txt
> 
> ... from my rudimentary reading, it seems to be OK:  Capabilities: [d0]
> Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
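
If it helps, the negotiated PCIe link speed/width per port can be pulled
straight out of that lspci output (rough sketch, assuming an lspci new
enough to print the Link Capabilities/Status lines):

  # compare the negotiated link (LnkSta) against what each card
  # supports (LnkCap) for every Ethernet function
  lspci -vv | grep -E 'Ethernet controller|LnkCap|LnkSta'

An x4-capable card that has negotiated down to x1 or x2 would point at
the bus-bandwidth theory.
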
> 
> Interrupts and context-switches are remarkably low:
> 
> cor4:~# vmstat 1
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  0  0     60 1057408 164120 331416    0    0     0     0    0    0  1  4 95  0
>  0  0     60 1056608 164120 331440    0    0     0     0 14639 1065  1 13 86  0
>  0  0     60 1056956 164120 331440    0    0     0     0 15794 1216  2 15 83  0
>  0  0     60 1056980 164120 331440    0    0     0     0 13741  920  2  6 92  0
>  0  0     60 1056980 164120 331440    0    0     0     0 18241  953  2  9 89  0
>  0  0     60 1056748 164120 331440    0    0     0     0 13219 1374  5 15 80  0
> 
> Neither eth0 nor eth1 (which provide bond0 and carry much lower traffic)
> is dropping any packets.
> 
> Any advice warmly welcomed! :)
> 
> gdh


How are the IRQs mapped to CPUs? For best network performance you
have to manually assign SMP affinity for the IRQs. Alternatively,
irqbalance can try to do this automatically.

Look at /proc/interrupts for the current mapping, and at
/proc/irq/<IRQ>/smp_affinity to change it.
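
Something along these lines, for example (a rough sketch - the IRQ
number 1270 is simply what your /proc/interrupts listing shows for
eth2, and the value written is a hex CPU bitmask):

  # see which CPUs are currently servicing the NIC interrupts
  grep eth /proc/interrupts
  # pin eth2's IRQ to CPU 1 (bitmask 0x2); repeat per interface
  echo 2 > /proc/irq/1270/smp_affinity

With irqbalance stopped, spreading the four ports across different
cores usually keeps a single CPU from becoming the bottleneck.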

